chipx86 · 2026-05-26T08:51:58 1779785518

We're still using it for our stuff (via our @beanbag/spina TypeScript wrapper for Backbone that has a few additional niceties), and I feel the same way. We can be as close to the DOM as we need (or let something manage it for our view for us), we can be smart about where data lives and how things interact with it/listen for changes, and we get a decent little object model that doesn't dictate how we think about everything.

Very happy still with the Backbone way of building webapps.

chipx86 · 2025-12-05T09:31:32 1764927092

I'm pretty happy to see this, as conditionals can really help keep code manageable when trying to define CSS variables or other properties based on combinations of light mode, dark mode, high-contrast, contextual state in a document or component, etc.

if() isn't the only way to do this, though. We've been using a technique in Review Board that's roughly equivalent to if(), but compatible with any browser supporting CSS variables. It involves:

1. Defining your conditions based on selectors/media queries (say, a dark mode media selector, light mode, some data attribute on a component, etc.).

2. Defining a set of related CSS variables within those to mark which are TRUE (using an empty value) and which are FALSE (`initial`).

3. Using those CSS variables with fallback syntax to choose a value based on which is TRUE (using `var(--my-state, fallback)` syntax).

I wrote about it all here, with a handful of working examples: https://chipx86.blog/2025/08/08/what-if-using-conditional-cs...

Also includes a comparison between if() and this approach, so you can more easily get a sense of how they both work.

bradly · 2025-12-05T17:20:09 1764955209

I recently implemented dark/light mode for the first time and was really surprised to find that in order to add a toggle I had to duplicate a bunch of vars/styles and use JavaScript. I'm looking forward to not having to deal with that cruft in the future.

hebelehubele · 2025-12-05T20:35:53 1764966953

You can do without javascript. Checkbox `:checked` + `label` trick for toggling states still works.

See: https://codepen.io/abdus/pen/bNpQqXv

bradly · 2025-12-06T01:55:04 1764986104

Thanks, I’ll check this out!

hebelehubele · 2025-12-06T10:10:06 1765015806

Here's a version with a 3-way theme selector (light / system / dark) using radio inputs.

https://codepen.io/abdus/pen/yyOQpvE

chipx86 · 2025-10-12T06:41:52 1760251312

Love seeing this one. My uncle was co-founder of Quarterdeck, and I grew up in a world of DESQview and QEMM. It was a big influence on me as a child.

Got a good family story about that whole acquisition attempt, but I don't want to speak publicly on behalf of my uncle. I know we've talked at length about the what-ifs of that moment.

I do have a scattering of some neat Quarterdeck memorabilia I can share, though:

https://www.dropbox.com/scl/fo/0ca1omn2kwda9op5go34e/ACpO6bz...

abdulhaq · 2025-10-12T10:06:42 1760263602

I loved the Quarterdeck stuff especially Desqview

chipx86 · 2025-06-30T19:39:43 1751312383

This is a great feature for those who want a mix of positional and keyword-only arguments.

I should have mentioned originally (and I've since updated my post) that this and the kw_only= flag both require Python 3.10 and higher, so code that works with older versions can't opt into it yet.

chipx86 · 2025-06-30T07:34:59 1751268899

That's annoying for sure. Though a different problem.

All the kw_only=True argument for dataclasses does is require that you pass any fields you want to provide as keyword arguments instead of positional arguments when instantiating a dataclass. So:

    obj = MyDataclass(a=1, b=2, c=3)

Instead of:

    obj = MyDataclass(1, 2, 3)  # This would be an error with kw_only=True

The problem you're describing in boto3 (and a lot of other API bindings, and a lot of more layered Python code) is that methods often take in **kwargs and pass them down to a common function that's handling them. From the caller's perspective, **kwargs is a black box with no details on what's in there. Without a docstring or an understanding of the call chain, it's not helpful.

Python sort of has a fix for this now, which is to use a TypedDict to define all the possible values in the **kwargs, like so:

    from typing import TypedDict, Unpack


    class MyFuncKwargs(TypedDict):
        arg1: str
        arg2: str
        arg3: int | None


    def my_outer_func(
        **kwargs: Unpack[MyFuncKwargs],
    ) -> None:
        _my_inner_func(**kwargs)


    def _my_inner_func(
        *,
        arg1: str,
        arg2: str,
        arg3: int | None,
    ) -> None:
        ...

By defining a TypedDict and typing **kwargs, the IDE and docs can do a better job of showing what arguments the function really takes, and validating them.

Also useful when the function is just a wrapper around serializing **kwargs to JSON for an API, or something.

But this feature is far from free to use. The more functions you have, the more of these you need to create and maintain.

Ideally, a function could type **kwargs as something like:

    def my_outer_func(
        **kwargs: KwargsOf[_my_inner_func],
    ) -> None:
        ...

And then the IDEs and other tooling can just reference that function. This would help make the problem go away for many of the cases where **kwargs is used and passed around.

mvieira38 · 2025-06-30T14:25:50 1751293550

TypedDicts are so underutilized in general. I'm using them a lot even for simpler scripts

maleldil · 2025-06-30T23:51:23 1751327483

I don't see a point in using them in new code when I could just use a dataclass (or Pydantic in certain contexts). I've only found them useful when interfacing with older code that uses dicts for structured data.

chipx86 · 2025-06-30T06:22:21 1751264541

That's always the challenge when iterating on interfaces that other people depend on.

What we do is go through a deprecation phase. Our process is:

* We provide compatibility with the old signature for 2 major releases.

* We document the change and the timeline clearly in the docstring.

* The function gets decorated with a helper that checks the call, and if any keyword-only arguments are provided as positional, it warns and converts them to keyword-only.

* After 2 major releases, we move fully to the new signature.
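
To make the third step concrete, here's a stripped-down sketch of what such a helper might do. This is not housekeeping's implementation; `warn_on_positional` is a hypothetical name, the message wording is invented, and it ignores functions that take `*args`:

```python
import functools
import inspect
import warnings


def warn_on_positional(warning_cls):
    """Warn when keyword-only arguments are passed positionally.

    Hypothetical helper for illustration only.
    """
    def decorator(func):
        params = list(inspect.signature(func).parameters.values())
        positional_kinds = (
            inspect.Parameter.POSITIONAL_ONLY,
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
        )

        # How many parameters may legitimately be passed positionally.
        n_positional = sum(p.kind in positional_kinds for p in params)
        kw_only_names = [
            p.name for p in params
            if p.kind is inspect.Parameter.KEYWORD_ONLY
        ]

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            extra = args[n_positional:]

            # If there are more surplus positionals than keyword-only
            # parameters, fall through and let func raise TypeError.
            if extra and len(extra) <= len(kw_only_names):
                names = kw_only_names[:len(extra)]
                warnings.warn(
                    f'{", ".join(names)} must be passed as keyword '
                    f'arguments when calling {func.__qualname__}()',
                    warning_cls,
                    stacklevel=2,
                )

                # Convert the surplus positionals to keywords.
                args = args[:n_positional]
                kwargs.update(zip(names, extra))

            return func(*args, **kwargs)

        return wrapper

    return decorator


@warn_on_positional(DeprecationWarning)
def my_func(*, a, b, c):
    return a, b, c
```

Old-style calls like `my_func(1, 2, c=3)` keep working but emit the warning, and new-style calls pass straight through.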

We built a Python library called housekeeping (https://github.com/beanbaginc/housekeeping) to help with this. One of the things it contains is a decorator called `@deprecate_non_keyword_only_args`, which takes a deprecation warning class and a function using the signature we're moving to. That decorator handles the check logic and generates a suitable, consistent deprecation message.

That normally looks like:

    @deprecate_non_keyword_only_args(MyDeprecationWarning)
    def my_func(*, a, b, c):
        ...

But this is a bit more tricky with dataclasses, since `__init__()` is generated automatically. Fortunately, it can be patched after the fact. A bit less clean, but doable.

So here's how we'd handle this case with dataclasses:

    from dataclasses import dataclass
    
    from housekeeping import BaseRemovedInWarning, deprecate_non_keyword_only_args
    
    
    class RemovedInMyProject20Warning(BaseRemovedInWarning):
        product = 'MyProject'
        version = '2.0'
    
    
    @dataclass(kw_only=True)
    class MyDataclass:
        a: int
        b: int
        c: str
    
    MyDataclass.__init__ = deprecate_non_keyword_only_args(
        RemovedInMyProject20Warning
    )(MyDataclass.__init__)

Call it with some positional arguments:

    dc = MyDataclass(1, 2, c='hi')

and you'd get:

    testdataclass.py:26: RemovedInMyProject20Warning: Positional arguments `a`, `b` must be passed as keyword arguments when calling `__main__.MyDataclass.__init__()`. Passing as positional arguments will be required in MyProject 2.0.
      dc = MyDataclass(1, 2, c='hi')

We'll probably add explicit dataclass support to this soon, since we're starting to move to kw_only=True for dataclasses.

codethief · 2025-06-30T07:16:29 1751267789

Shouldn't you also be able to patch MyDataclass in a class decorator (on top of/after @dataclass)?

chipx86 · 2025-06-30T07:24:29 1751268269

Yeah, that's the approach we'll be taking in housekeeping. I didn't want to complicate the example any more than I already did :)

chipx86 · 2025-06-04T08:09:40 1749024580

Generally speaking, you probably shouldn't have to deal with these problems unless you're writing a tool that has to interface with certain SCMs or SCMs used in certain environments. I'll give you some examples for each of these points:

1. There are two important areas where encoding can matter: The filename and the diff content.

Git pays attention to filename encoding, but most SCMs don't, so when a diff is generated, it's based on the local encoding. If there are any non-ASCII characters in that filename, a diff generated in one environment with one encoding set can end up not applying to another (or, in our case, not being able to be looked up from a repository). This isn't common but it can happen (we've seen this on Perforce and Subversion).

Then there's the content. Many SCMs will actually give you a representation of a text file and not the raw contents itself. That text file will be re-encoded for your local/preferred encoding, and newlines may be adjusted as well (`\r\n`, `\n`). The text file is then re-encoded back when pushing the change. This allows people in different environments to operate on the same file regardless of what encoding they're working with.

This doesn't necessarily make its way into the diff, though. So when you send a diff from a less-common encoding to a tool to process it, and that tries to apply it to the file checked out with its encoding, it can fail to patch.

The solution is to either know the encoding of the file you're processing, or try to guess it (some tools, like ours, let you specify a list of preferred encodings to try).

It's best if you can know it up-front.
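
The guessing fallback can be as simple as the sketch below. This is an illustration of the idea, not our actual code; `decode_with_fallback` and the default encoding list are made up for the example.

```python
from collections.abc import Sequence


def decode_with_fallback(
    data: bytes,
    encodings: Sequence[str] = ('utf-8', 'cp1252'),
) -> tuple[str, str]:
    """Return (text, encoding) using the first encoding that decodes.

    Hypothetical helper. Order matters: put strict encodings first,
    since permissive ones like latin-1 accept any byte sequence.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue

    raise ValueError(f'Could not decode using any of: {list(encodings)}')


text, used = decode_with_fallback('café'.encode('cp1252'))
```

A successful decode only means the bytes were valid in that encoding, not that it was the right one, which is why knowing it up-front is better.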

Bonus Fun Fact: On some SCMs (Perforce comes to mind), checking out a file on Windows and then diffing it on Linux via a shared mount can get you a diff with `\r\r\n` newlines. It's a bad time and breaks patching. And it used to come up a lot, until we worked around it.

Also, Perforce for a while would sometimes normalize encodings incorrectly and you'd end up with BOMs in the diff, breaking GNU patch.
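
A workaround for both can be sketched as a per-line cleanup pass before parsing. This is a simplified illustration, not the actual workaround we ship: `clean_diff_line` is a hypothetical name, and it assumes a UTF-8 BOM at the very start of a line and that `\r\r\n` should collapse to `\r\n`.

```python
import codecs


def clean_diff_line(line: bytes) -> bytes:
    """Undo two kinds of damage on a single raw diff line.

    Hypothetical helper for illustration only.
    """
    # Drop a stray UTF-8 BOM at the start of the line.
    if line.startswith(codecs.BOM_UTF8):
        line = line[len(codecs.BOM_UTF8):]

    # Collapse the doubled carriage return back to one.
    if line.endswith(b'\r\r\n'):
        line = line[:-3] + b'\r\n'

    return line


cleaned = [
    clean_diff_line(line)
    for line in b'+foo\r\r\n+bar\r\r\n'.splitlines(keepends=True)
]
```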

2. It does when you're working with them directly for applying and patching. If you're handing them off to a tool for processing, if there's any risk of one file in a sequence not being included, you can end up with breakages that maybe you don't see until later processing.

It's also just really nice having all the state and metadata up-front so we can process it in one go in a consistent way without having to sanity-check all the diffs against each other.

When working locally, it also depends on your tooling. `git format-patch` and `git am` are great, but are for Git. If I'm working with (let's just say) Subversion, I need to do my own thing or find another tool.

3. It's critical for the kind of information needed to locate files in a repository. Some systems need a commit-wide identifier. Some need per-file identifiers. Some need a combination of the two. Some need those plus additional data not otherwise represented in the path or revision (generally more enterprise SCMs targeting certain use cases).

It's also critical for representing information that isn't in the Unified Diff format (namely, anything but the filename). So, symlink information, file modes, SCM-specific properties on a file or directory, to name a few. This information needs to live somewhere if a SCM provides it, and it's up to every SCM to choose how and where to store that data (and then how it's encoded, etc.).

account42 · 2025-06-04T10:14:54 1749032094

> Then there's the content. Many SCMs will actually give you a representation of a text file and not the raw contents itself. That text file will be re-encoded for your local/preferred encoding, and newlines may be adjusted as well (`\r\n`, `\n`). The text file is then re-encoded back when pushing the change.

Yeah, don't do that.

> This allows people in different environments to operate on the same file regardless of what encoding they're working with.

No, it causes hard-to-understand bugs, because now what people see on their device and what is tracked in source control differ, defeating the entire purpose of having source control in the first place. This isn't theoretical at all, btw.

> The solution is to either know the encoding of the file you're processing

In general, there is no such encoding - source control tools need to be able to deal with files not valid in any single encoding.

chipx86 · 2025-06-04T07:23:30 1749021810

We build a code review product that interfaces with over a dozen SCMs. In about 20 years of writing diff parsers, we've encountered all kinds of problems and limitations in SCM-generated diff files (which we have to process) that we wouldn't ever have expected to even consider thinking about before. This all comes from the pain points and lessons learned in that work, and has been a huge help in solving these for us.

These aren't problems end users should hopefully ever need to worry about, but they're problems that tools need to worry about and work around. Especially for SCMs that don't have a diff format of their own, have one that is missing data (in some, not all changes can be represented, e.g. deleted files), or don't include enough information for another tool to identify the file in a repository.

HelloNurse · 2025-06-04T07:55:03 1749023703

Better file formats cannot, by themselves, improve an inferior SCM tool that, for instance, processes files with the wrong text encoding or forgets deleted and renamed files: they would only have helped you for the purpose of developing your code review tool.

Standards are meant for interchange, like (as mentioned in other comments) producing a patch file by any means and having someone else apply it regardless of what they use for version control.

chipx86 · 2025-06-04T07:18:50 1749021530

difftastic is great!

This isn't a tool for viewing changes to files or to ASTs. This is a way of being able to generate a single diff file for processing or patching that addresses the kinds of problems we've encountered in over 20 years of building diff parsing tooling and working with over a dozen SCMs with varying levels of completeness or brokenness of bespoke custom diff formats.

It's not an end user tool, but a useful format for tools like code review products to use.

chipx86 · 2025-06-04T07:02:23 1749020543

In the early drafts, we played with a number of approaches for the structure. Things like "commit-meta", etc. In the end, we broke it down into `#<section_level><section_type>`, just to simplify the parsing requirements. Every meta block is a meta block, and knowing what section level you're supposed to be in and comparing it to the section level you get becomes a matter of "count the dots".

The header formats are meant to be very simple key/value pairs that are known by the parser, and not free-form bits of metadata. That's what the "meta" blocks are for. The parsing rules for the header are intentionally very simple.
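
To show how little is needed, here's a rough sketch of parsing a section header like `#..meta: format=json, length=270`. It's a simplified illustration rather than a spec-complete parser: the regex, the `parse_section_header` name, and the assumption that options are plain `key=value` pairs separated by commas are mine.

```python
import re

# "#", then zero or more dots (the level), the section type, a colon,
# and an optional list of options.
HEADER_RE = re.compile(r'^#(?P<dots>\.*)(?P<type>[a-z]+):\s*(?P<opts>.*)$')


def parse_section_header(line: str) -> tuple[int, str, dict[str, str]]:
    """Return (section_level, section_type, options) for a header line.

    Hypothetical helper for illustration only.
    """
    m = HEADER_RE.match(line)

    if m is None:
        raise ValueError(f'Not a section header: {line!r}')

    options: dict[str, str] = {}

    if m['opts']:
        for pair in m['opts'].split(','):
            key, _, value = pair.strip().partition('=')
            options[key] = value

    # The section level is literally the number of dots.
    return len(m['dots']), m['type'], options


level, section_type, options = parse_section_header(
    '#..meta: format=json, length=270'
)
```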

JSON was chosen after a lot of discussion between us and outside parties and after experimentation with other grammars. The header for a meta block can specify a format used to serialize the data, in case down the road something supplants JSON in a meaningful way. We didn't want to box ourselves in, but we also don't want to just let any format sit in there (as that brings us back to the format compatibility headaches we face today).

For the other notes:

1. Compatibility is a key factor here, so we'd want to go with base-level JSON. I'd prefer being able to have trailing commas in lists, but not enough to make life hard for someone implementing this without access to a JSON5 parser.

2. If your goal is to simply feed to GNU patch (or similar), you can still split it. This extra data is in the Unified Diff "garbage" areas, so they'll be ignored anyway (so long as they don't conflict, and we take care to ensure that in our recommendations on encoding).

If your goal is to split into two DiffX files, it does become more complicated in that you'd need to re-add the leading headers.

That said, not all diff formats used in the wild can be split and still retain all metadata. Mercurial diffs, for example, have a header that must be present at the top to indicate parent commit information. You can remove that and still feed to GNU patch, but Mercurial (or tools supporting the format) will no longer have the information on the parent commit.

3. Revisions depend heavily on the SCM. Some SCMs use a commit identifier. Some use per-file identifiers. Some use a combination of the two. Some use those plus additional information that either gets injected into the diff or needs to be known out-of-band. There's a wide variety of requirements here across the SCM landscape.

laserbeam · 2025-06-04T10:42:27 1749033747

> The header formats are meant to be very simple key/value pairs that are known by the parser, and not free-form bits of metadata. That's what the "meta" blocks are for.

One more thing you should prepare for whenever you have "free-form bits of metadata". They somehow turn into: "some user was storing 100MB blobs in there, and that broke our other thing".

laserbeam · 2025-06-04T08:13:37 1749024817

> 1. Compatibility is a key factor here, so we'd want to go with base-level JSON. I'd prefer being able to have trailing commas in lists, but not enough to make life hard for someone implementing this without access to a JSON5 parser.

This is what I was referring to. This is not json:

> #..meta: format=json, length=270

> The header formats are meant to be very simple key/value pairs that are known by the parser, and not free-form bits of metadata. That's what the "meta" blocks are for. The parsing rules for the header are intentionally very simple.

Exactly my point. That level of flexibility for a .patch format to support another language embedded in it is overwhelming. Keep in mind that you are proposing a textual format, not a binary format. So people will use 3rd party text parsing tools to play with it. And having 2 distinct languages in there makes that annoying and a pain.

hdjrudni · 2025-06-06T05:36:33 1749188193

How do they reasonably work around that though? If they want the ability to move away from JSON, you have to know that it is JSON before trying to parse it. And then you need to know how much data to read. So I can see why they put those 2 tidbits of info above the data block.

Maybe they could have said too bad, JSON for life, we'll never change it. OK. But then you still need the length or a delimiter for the "end of json".
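
As a rough sketch of why both are handy (my own illustration, not taken from the DiffX tooling): with the format and length in hand, a reader never has to scan for where the JSON ends. `read_meta_block` is a hypothetical name, and I'm assuming a byte stream, a byte length, and UTF-8 content.

```python
import io
import json


def read_meta_block(stream, options):
    """Read a length-delimited meta block and parse it.

    Hypothetical helper; `options` is the header's key/value pairs.
    """
    if options.get('format') != 'json':
        raise ValueError(f'Unsupported format: {options.get("format")!r}')

    length = int(options['length'])
    raw = stream.read(length)

    if len(raw) != length:
        raise ValueError('Truncated meta block')

    return json.loads(raw.decode('utf-8'))


# Made-up payload followed by the next section's header.
payload = b'{"id": "abc123"}\n'
stream = io.BytesIO(payload + b'#...diff: length=0\n')

meta = read_meta_block(
    stream,
    {'format': 'json', 'length': str(len(payload))},
)

# The stream is now positioned exactly at the next header.
next_line = stream.readline()
```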

WhyNotHugo · 2025-06-04T15:58:10 1749052690

What was your reasoning for discarding the existing header format used by git?

quotemstr · 2025-06-04T07:16:34 1749021394

> Compatibility is a key factor here, so we'd want to go with base-level JSON. I'd prefer being able to have trailing commas in lists, but not enough to make life hard for someone implementing this without access to a JSON5 parser.

Everyone has access to a JSON5 parser. Everyone has to suffer for the sake of a few people who don't want to pay the trifling tax of pip installing something --- when they're using an external library for a novel file format _anyway_?

genocidicbunny · 2025-06-04T08:03:47 1749024227

> Everyone has access to a JSON5 parser.

That's just a lack of imagination. When you're making a product for teams that span everything from a brand new startup using the latest tooling to teams that are working on software that runs on embedded systems from the 90's, you need to consider things like this.

roblabla · 2025-06-04T08:53:44 1749027224

There are JSON5 parsers written in C89 out there. And your embedded systems from the 90s probably don't have a JSON parser built in at all either... If you're going to build your own JSON parser, adding JSON5 support on top is really trivial.

genocidicbunny · 2025-06-04T09:26:34 1749029194

That doesn't mean it's not going to be difficult to use that parser. Not everyone has the luxury of being able to use third-party code, or having the time allotted to write a JSON5 parser. The JSON parser some places are using may have been written two decades ago and works well enough that there's little motivation to implement JSON5 support. Sometimes it's just company policy or internal politics that prevent the usage.

It's also just not that big a deal overall for the intended use of the DiffX format. It's mainly machine-generated and machine-consumed. There's human readability concerns for sure, but the format looks to be designed mainly for tools to create and consume, so missing a few features that JSON5 brings is not that big of a deal.

DannyBee · 2025-06-04T17:47:46 1749059266

"That doesn't mean it's not going to be difficult to use that parser. Not everyone has the luxury of being able to use third-party code, or having the time allotted to write a JSON5 parser."

Why are these people the target market?

I understand it may be important to you, but that isn't the same as "matters to target market/audience".

On top of that, the same constraints you mention here would stop you parsing current git patch formats, and lots of other things anyway. So you were never going to be using modern tools that might care here.

This is all also really meta. Who exactly is writing software with >1% market share, needs to parse the patch format, and can't access a JSON parser?

Instead of this theoretical discussion, let's have a concrete one.

genocidicbunny · 2025-06-04T19:36:28 1749065788

In this specific instance, those people are part of the target market because the project chooses to make them part of the target market. It's worked well enough for Review Board.

quotemstr · 2025-06-04T15:31:46 1749051106

So the whole world should suffer through vanilla JSON because someone, somewhere, has an overbearing and paranoid software approval process? That's the attitude that delayed universal Unicode adoption by a decade.

genocidicbunny · 2025-06-04T19:31:44 1749065504

That's a bit dramatic. This isn't something as universal as Unicode. You really only need to care about this if you're writing tools that generate or consume the DiffX format, which is not something most people will be doing. The whole world isn't suffering their decision to use JSON instead of JSON5.

DannyBee · 2025-06-04T17:44:49 1749059089

I don't think this is true, and honestly, I think it would be a mistake to consider it - they can't serve everyone, down that path is madness. FWIW - I even have a JSON parser in my RTOS-that-must-run-in-less-than-512k.

I also think that target of "embedded systems from the 90's" makes no sense because the tooling for the embedded system, which is what would conceivably want to handle patch format, ran on the host, which easily had access to a JSON parser.

But let's assume it does matter - let's be super concrete - assume they want to serve 95-99% of the users of patch format (i doubt it's even that high).

Which exact pieces of software with even >1% market share that need to process patch format don't have access to a JSON parser?