Note that my expectation would be that the non-boxed form would be as trivial as adding `[NonBoxedUnion(SomeImplStrategyChoiceEnum)]` (or `[NonBoxedUnion]` for some default strategy choices that likely are ok).
This would give you extremely fine-grained, flexible control over how you want your non-boxing union to work. There's no single right answer. There are just tradeoffs in terms of space/speed/copying-costs/memory-safety/etc.
I think it would make the most sense as people who care about boxing will have very different views and needs in terms of things like space, casting costs, copying speed etc.
The vast vast majority of users do not need to care at all. And for that, a boxed approach works exceptionally well.
You're correct. The unions we're working on right now are 'type unions'. So the type is inherent in the union distinction, and you would not be able to distinguish that case. That said, we're also looking at full blown discriminated unions (you can look at one of my proposals for that here: https://github.com/dotnet/csharplang/blob/main/meetings/work...), which would allow for that. Syntax entirely tbd, but you'd do something like:
    enum struct Either<T1, T2> // or enum class
    {
        First(T1 value),
        Second(T2 value)
    }
We view these features as complementary. Indeed, if you look at the extended enum proposal, you'll see it builds on top of unions and closed types (another proposal coming in the next version of the lang).
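For comparison, here is a minimal sketch of how the `Either<T1, T2>` shape can be approximated in C# today, assuming C# 9 or later. It is not the proposed `enum struct` syntax and not what the compiler would generate; the nested-record layout and the `Describe` helper are assumptions made for illustration. The point it shows is that named cases stay distinguishable even when `T1` and `T2` are the same type, which a pure type union cannot do.

```csharp
using System;

// Both cases carry an int, yet they remain distinct because each case has a name.
Either<int, int> a = new Either<int, int>.First(1);
Either<int, int> b = new Either<int, int>.Second(1);

Console.WriteLine(Describe(a)); // First(1)
Console.WriteLine(Describe(b)); // Second(1)

// Hypothetical helper: the discard arm is needed because today's compiler
// cannot prove the hierarchy is closed.
static string Describe(Either<int, int> e) => e switch
{
    Either<int, int>.First f => $"First({f.Value})",
    Either<int, int>.Second s => $"Second({s.Value})",
    _ => throw new InvalidOperationException("unreachable"),
};

// Hand-rolled stand-in for the proposed `enum class Either<T1, T2>`.
abstract record Either<T1, T2>
{
    // A private constructor keeps other code from adding more cases.
    private Either() { }

    public sealed record First(T1 Value) : Either<T1, T2>;
    public sealed record Second(T2 Value) : Either<T1, T2>;
}
```

The throwing discard arm is the kind of ceremony that closed types would presumably make unnecessary.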
Contrary to what a lot of people guess, boxing is actually a really good strategy most of the time. And, is indeed what many people are doing here anyways. The design supports a pattern that allows for non-boxing, and I expect that we will both supply an implementation for that with reasonable defaults, and that source generators will be a great way to augment this to get highly specialized impl strategies for non-boxing depending on the varying domain needs any specialized customer may have.
Hi there! C# language designer here, and one of the people working on unions.
Boxing is not something inherently to be avoided. It actually can work better in many (most?) use cases, and avoids a lot of problems that non-boxing approaches often cause (like tearing and copy costs).
It's true that the non-boxing pattern could be implemented by us. And it's very reasonable that that is something we may do post this release. However, it's a non-trivial area. There's no one correct 'non-boxed' implementation. For example, do you have separate fields for all your unmanaged data? Or do you have a blob of bytes that is large enough to align all your unmanaged data from the largest set of unmanaged fields, and you unsafe index into that?
Similar question for managed data. Do you have strongly typed fields for that data? Or do you attempt to use objects, to compact to as little space as possible? The former avoids casting costs. The latter allows you to minimize space. You can also potentially use unsafe casts. But those might introduce memory holes in tearing situations. etc. etc.
Because of this, I think the best outcome is to define the pattern (which we've done) and then use generators to allow you to control precisely the impl strategy, giving you all the bells and knobs you want to best fit your domain.
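To make the tradeoffs above concrete, here is a rough sketch of three hand-written layouts for a hypothetical `int | double | string` union. Every type and member name is invented for illustration; none of this is the union pattern the compiler recognizes, nor what a generator would necessarily emit. It only shows the shape of each choice: one boxed reference, one typed field per case, or unmanaged payloads overlapped at the same offset.

```csharp
#nullable enable
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

var boxed = new BoxedUnion(42);
var fields = FieldsUnion.From(3.5);
var overlapped = OverlappedUnion.From(42);

Console.WriteLine(boxed.Value is int n ? $"boxed int {n}" : "boxed other");
Console.WriteLine($"fields: {fields.Kind} {fields.DoubleValue}");
Console.WriteLine($"overlapped: {overlapped.Kind} {overlapped.IntValue}");

// Sizes vary by runtime and platform, so they are printed rather than assumed.
Console.WriteLine(Unsafe.SizeOf<BoxedUnion>());
Console.WriteLine(Unsafe.SizeOf<FieldsUnion>());
Console.WriteLine(Unsafe.SizeOf<OverlappedUnion>());

enum UnionKind : byte { Int, Double, String }

// Strategy A: box. One reference field, so copying it is a single pointer
// write, but every int or double stored costs a heap allocation.
readonly struct BoxedUnion
{
    public object? Value { get; }
    public BoxedUnion(object? value) => Value = value;
}

// Strategy B: a tag plus one strongly typed field per case. No boxing and no
// casts, but the struct is as large as all fields together, and a multi-field
// struct is not copied atomically (the tearing concern).
readonly struct FieldsUnion
{
    public UnionKind Kind { get; }
    public int IntValue { get; }
    public double DoubleValue { get; }
    public string? StringValue { get; }

    private FieldsUnion(UnionKind kind, int i, double d, string? s)
    {
        Kind = kind;
        IntValue = i;
        DoubleValue = d;
        StringValue = s;
    }

    public static FieldsUnion From(int value) => new(UnionKind.Int, value, 0, null);
    public static FieldsUnion From(double value) => new(UnionKind.Double, 0, value, null);
    public static FieldsUnion From(string value) => new(UnionKind.String, 0, 0, value);
}

// Strategy C: overlap the unmanaged payloads. Smaller than B, but reading the
// wrong member reinterprets bits, so the tag must always be checked first.
[StructLayout(LayoutKind.Explicit)]
struct OverlappedUnion
{
    [FieldOffset(0)] private UnionKind _kind;
    [FieldOffset(8)] private int _int;
    [FieldOffset(8)] private double _double;

    public UnionKind Kind => _kind;
    public int IntValue => _int;
    public double DoubleValue => _double;

    public static OverlappedUnion From(int value)
    {
        var u = default(OverlappedUnion);
        u._kind = UnionKind.Int;
        u._int = value;
        return u;
    }

    public static OverlappedUnion From(double value)
    {
        var u = default(OverlappedUnion);
        u._kind = UnionKind.Double;
        u._double = value;
        return u;
    }
}
```

Strategy C only covers the unmanaged cases here; mixing in the `string` case would need a separate reference field, which is exactly the kind of decision the comment says has no single right answer.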
Yes. We do. And unions should work well in F#. It's designed to be a very easy pattern for all CLR languages and compilers to understand (including F#).
> I guess I overdramatized the situation a bit :) It's a passionate topic for me; as somebody who has been using C# at work for 10 years now, I'm just not happy with the direction the language has been taking.
You should come engage with us on this then :)
We do all our design in the open on github. And a lot of us are available to chat and discuss all this stuff in Discord and the like :)
> C# is not using its budget very wisely in my opinion.
I can promise you. Every feature you think is great had similar detractors over the years. EverySingleOne :)
Intuitive is definitely in the eye of the beholder. When people saw:
`HashSet<string> people = [with(StringComparer.OrdinalIgnoreCase), .. group1, group2]`
they found it understandable. And this was also much nicer than what they'd have to write today (which would bring them out of the nice declarative collection-expression space).
Does that make it 'necessary'? Ultimately that's up to the individual. We felt like it was. Not being able to do simple things like this felt like a 'bitter pill'. Customization of collection construction is common (looking in codebases, it shows up about 7% of the time). So having to 'fall out' from the uniform collection-expr system into the much more verbose and clunky forms just for this common enough case felt 'necessary' to us.
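As a sketch of the 'fall out' being described, assuming C# 12 and .NET 8: the first set below can stay in collection-expression form, while the case-insensitive one has to go back to a constructor call plus mutation because the comparer is a constructor argument. The variable names and sample data are made up for the example.

```csharp
using System;
using System.Collections.Generic;

string[] group1 = ["Ann", "bob"];
string[] group2 = ["ANN", "Cy"];

// Default comparer: the whole thing fits in one collection expression.
HashSet<string> ordinal = [.. group1, .. group2];

// Custom comparer: today this leaves collection-expression syntax, since the
// comparer has to be handed to the HashSet constructor.
HashSet<string> ignoreCase = new(StringComparer.OrdinalIgnoreCase);
ignoreCase.UnionWith(group1);
ignoreCase.UnionWith(group2);

Console.WriteLine(ordinal.Count);    // 4: "Ann" and "ANN" are distinct
Console.WriteLine(ignoreCase.Count); // 3: "Ann" and "ANN" collapse
```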
>But I feel that there has to be a direction, things have to work together to make a language feel coherent.
I feel like this is conflicting feedback. Collection expressions made the language more coherent. Instead of 7 different ways of doing things (some of which were genuinely not efficient), we gave one uniform way of doing it. That makes things more coherent. Making it so you don't have to drop out of that for something as simple as configuring the collection makes things more coherent.
As someone who has been coding C# since the pre-generics days, this is the first syntax change which I strongly disagree with. I pretty much love every little bit of syntactic sugar you guys have added to the language. But this? This seems objectively illogical and just straight up ugly. It blows my mind that this is making it into the language, and it makes me worry about the future of C#.
Just guessing here: Putting something at index 0 that has nothing to do with the content.
I tend to agree, but I also didn’t try it.
I also think the 7% number is wrong. 0.5% seems more realistic. It’s just that you are not able to see the majority of the code. Not everything is checked in or on GitHub.
There isn't any such pressure. These features only happen because someone goes out of their normal job space to push for the necessity for them. All of the design team have full time work on other things. The design and impl only happens if the whole team can be convinced that it is important and worth investing in. Note that a lot of that convincing comes from the tons of feedback we get everywhere. This is anywhere from github, to partners (first, second, third), to conferences, forums, hacker news etc. etc. etc. We have tons coming in constantly. We pick these items up and spend this time on it precisely because we've seen the problems, and how it is affecting the ecosystem, and our future goals there, and we think it is then worthwhile.
I understand you feel this is like `!!`. We do not. We think being able to make a dictionary, and pass in a custom comparer, is deeply important. Analyzing code out there, we find that this happens in anywhere from 5-10% of all dicts. That is a ton of codebases and users impacted, and we've already heard from many of them about the friction this causes. Simply discarding that group greatly undercuts one of the core value props that collection expressions bring: a uniform and simple syntax that should suffice for nearly all collection needs.
You may feel differently. That's life in the design world :)
Practically speaking, I've found that Claude never uses collection expressions, so the feature has disappeared from my code. Before AI, the feature was looked at with skepticism by my coworkers. We like writing "var" for all variable declarations. You have to write the type on the left side if you want to declare a variable with a collection expression, and we would never do that otherwise. Can't do `foreach (var x in [1, 2, 3])`. Too often, you have to make specific accommodations in your code to allow the collection expression to be valid.
Collection expressions today are more the sort of thing that a code poet or golfer can do to prettify their code than something a newbie can count on using. It's tough to explain "you can only use this when the collection type is implied in that spot" to a newbie. The value of the base feature is still unproven for me. I'm not sure I agree, without some convincing, that collection expressions made the language more coherent rather than doing https://xkcd.com/927.
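The target-typing restriction described in this comment looks roughly like this, as far as I understand C# 12. The commented-out lines are the ones that do not compile because the expression has no type to convert to; the `Total` helper is a made-up name for the example.

```csharp
using System;
using System.Collections.Generic;

// var numbers = [1, 2, 3];           // error: no target type for the expression
// foreach (var x in [1, 2, 3]) { }   // error for the same reason

// Writing the type on the left gives the expression something to target.
int[] numbers = [1, 2, 3];
List<int> list = [.. numbers, 4];

int sum = 0;
foreach (var x in list) sum += x;
Console.WriteLine(sum); // 10

// A parameter type also works as a target.
Console.WriteLine(Total([5, 6])); // 11

static int Total(IReadOnlyList<int> values)
{
    int total = 0;
    foreach (var v in values) total += v;
    return total;
}
```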
> Collection expressions made the language more coherent. Instead of 7 different ways of doing things (some of which were genuinely not efficient), we gave one uniform way of doing it.
I see your point on this. My dislike comes from a mixture of "I don't like how it looks" and "this language already has tons of features".
In terms of looks, I wish it could be more coherent with existing syntax.
List<int> = new {1, 2, 3} and List<int> = {1, 2, 3} are obviously taken up by anonymous types and blocks themselves. Would something like
List<int> = new(capacity: 10)[1, 2, 3]
have been possible? It feels like a combination of target-typed new and the initialization syntax. It involves the "new" keyword, which everybody already associates with constructor calls. It's short. Obviously, I don't know if this even works, maybe there's a parsing issue there (aren't those the most annoying issues in language design haha).
> they found it understandable
Kind of in my experience. Me and the people I've shown this to can easily remember it, but we all agree that it doesn't look like obvious syntax to them. Those two things are quite different to me. Contrast this to something like target-typed new, which immediately made sense to the same people. One might argue that that's fine enough and maybe it is, but I think, the less I have to remember about a language's syntax, the better. I'm going to have to remember many many other things anyway, better keep my memory free for the details of SynchronizationContext and async flow :)
I'm obviously aware that you get tons of bikeshedding comments like this all the time, so I'm sure you've gone through this. But to me, this invented syntax would have been fine. I just don't like the one that actually got in.
Now, the necessity on the other hand: May just be the company I'm working at, but my personal experience has never been that this is a big issue. Sure, it's nice to not have to fall back to explicit initialization a few more times. But personally, this doesn't pass my threshold of "painful enough to warrant additional syntax".
That's the core of my issue: Most, maybe all, of the new features in the language are fine to me in isolation. I may bikeshed about the explicit syntax (see: this thread). But my main issue is that the sum of complexity in the language and the issues beginners have when learning it are steadily increasing. I see this all the time at work.
As you said, this is definitely subjective. And in the end, language design is a very subjective process and maybe C# just won't be for me in the long run. But I wish it would, because at its core I like it, and .NET, a lot. Which is why I will continue to speak for my (subjective) viewpoint.
Well, this turned into a bit of an incoherent rant. I appreciate you exposing yourself to the HN acid pit ;)
> Would something like `List<int> = new(capacity: 10)[1, 2, 3]` have been possible?
Great question. And our design docs, and discussion with the community cover this. The reason that was eliminated as an option (we considered several dozen possible syntaxes) was that this syntax was actively confusing and misleading for people (for several reasons). These include (in no particular order):
1. the use of 'new' indicating that a new value was being allocated. That's not necessarily the case with collection expressions. The compiler is free to be smart here and not allocate if it doesn't need to. `[1, 2, 3]` for example, being constants, can in some cases just point at a data segment in the program.
2. the use of 'new' indicating that a constructor is being called ('new' has always meant that). That's not necessarily the case with collection expressions. Many collection forms (interfaces, immutables, spans, etc) do not go through constructors. This was actively confusing for people.
3. That syntax is already legal. It's an implicit object creation that is being indexed into.
4. There was strong feedback from many in the community (and the design group, and lots that we talked to) that having things outside the boundary of the `[ ... ]` syntax was actively confusing. One could not easily tell what the collection was and what wasn't part of it. The idea is that the `[ ... ]` is "the actual value". You know where it starts, where it ends, and what it represents.
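Points 1 and 2 are easier to see with a few targets side by side. This is a small sketch assuming C# 12 and .NET 8; what the compiler actually emits for each line is its own business and may change, so the comments only state what is visible from the source.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

// A span of constants: no `new` in sight, and the compiler is free to back it
// with static data instead of allocating.
ReadOnlySpan<int> span = [1, 2, 3];

// ImmutableArray<T> has no public constructor taking elements; the expression
// is built through the type's collection-builder hook instead.
ImmutableArray<int> immutable = [1, 2, 3];

// An interface target has no constructor at all; the compiler picks some
// concrete type to implement it.
IEnumerable<int> sequence = [1, 2, 3];

// Only a case like this one maps onto an ordinary constructor call.
List<int> list = [1, 2, 3];

Console.WriteLine(span.Length + immutable.Length + sequence.Count() + list.Count); // 12
```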
--
Of course, at the end of the day, absolutely none of this may sway you. That's why we have a design process and we go through so many options. There were dozens considered here and we had many axes we were trying to optimize for. Overall, this struck a balance of working nicely and having no major problems going against it (unlike other options).
> I'm obviously aware that you get tons of bikeshedding comments like this all the time, so I'm sure you've gone through this.
Yup :)
Totally ok with us though.
> But personally, this doesn't pass my threshold of "painful enough to warrant additional syntax".
Sure. But that's why we look at the entire ecosystem. And we converse with people who have full codebases they haven't been able to move over because of the lack of this. And we look at the pain that this will cause, especially when we get dictionary/key/value support. All of this motivated what was ultimately a tiny feature that cost very little to get in. It was medium bang for very low buck.
And that's worth explaining too. We are always working on some huge features. But they take up a ton of time and need tons of effort and runway. Small features like this are easy to slot in in gaps and help deal with papercuts and friction that are often annoying people.
Hi there! One of the C# language designers here, working on unions.
We're very interested in this space. And we're referring to it as, unsurprisingly, 'anonymous unions' (since the ones we're delivering in C#15 are 'nominal' ones).
An unfortunate aspect of lang design is that if you do something in one version, and not another, people think you don't want the other (not saying you think that! but some do :)). That's definitely not the case. We just like to break things over many versions so we can get the time to see how people feel about things and where our limited resources can be spent best next. We have wanted to explore the entire space of unions for a long time. Nominal unions. Anonymous unions. Discriminated unions. It's all of interest to us :)
Well, there is also the issue that some things get designed and then abandoned even though some improvements were expected: dynamic typing from the DLR and expression trees, for example.
Hi there! One of the C# language designers here, working on unions. And the author of that feature :D
So I'm happy to discuss the thinking here. It's not about saving keystrokes. It's about our decision that users shouldn't have 7 (yes 7) different ways of creating collections. They should just be able to target at least 99% of all cases where a collection is needed, with one simple and uniform syntax across all those cases.
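For readers who haven't counted, here is a sketch of several of the older creation forms next to the uniform one, assuming C# 12 and .NET 8. These are forms I know exist; they are not necessarily the exact seven the comment has in mind.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Older forms: each target type has its own spelling.
int[] a1 = new int[] { 1, 2, 3 };
int[] a2 = new[] { 1, 2, 3 };
int[] a3 = { 1, 2, 3 };
List<int> l1 = new List<int> { 1, 2, 3 };
Span<int> s1 = stackalloc int[] { 1, 2, 3 };
ImmutableArray<int> i1 = ImmutableArray.Create(1, 2, 3);
int[] e1 = Array.Empty<int>();

// Collection expressions: one spelling, the target type decides the rest.
int[] a = [1, 2, 3];
List<int> l = [1, 2, 3];
Span<int> s = [1, 2, 3];
ImmutableArray<int> i = [1, 2, 3];
int[] e = [];

Console.WriteLine(a1.Length + a2.Length + a3.Length + l1.Count + s1.Length + i1.Length + e1.Length); // 18
Console.WriteLine(a.Length + l.Count + s.Length + i.Length + e.Length); // 12
```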
When we created and introduced collection expressions, it was able to get close to that goal. But there were still cases left out, leaving people in the unenviable position of having to keep their code inconsistent.
This feature was tiny, and is really intended for those few percent of cases where you were stuck having to do things the much more complex way (see things like immutable builders as an example), just to do something simple, like adding an `IEqualityComparer<>`. This was also something that would become even more relevant as we add `k:v` support to our collections to be able to make dictionaries.
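The immutable-builder case mentioned above looks something like this today, assuming .NET 8. The sample strings are invented; the point is only that adding a comparer turns a one-line expression into a builder, several calls, and a final conversion.

```csharp
using System;
using System.Collections.Immutable;

// Default comparer: a collection expression is enough.
ImmutableHashSet<string> plain = ["Ann", "bob", "ANN"];

// Custom comparer: back to the builder dance.
var builder = ImmutableHashSet.CreateBuilder<string>(StringComparer.OrdinalIgnoreCase);
builder.Add("Ann");
builder.Add("bob");
builder.Add("ANN");
ImmutableHashSet<string> ignoreCase = builder.ToImmutable();

Console.WriteLine(plain.Count);      // 3
Console.WriteLine(ignoreCase.Count); // 2
```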
Hi there! One of the C# language designers here, working on unions. We're extremely interested in discriminated unions. A real problem is that there is so much interest, with many varying proposals on how best to do them. It's a lot to go through, and we've found some of the best designs layer on standard unions. So we like this ordering to lay the foundation for discriminated unions to be built on top of! :)
Hi there! One of the C# language designers here, working on unions. All the different options have tradeoffs. As an example, the non-boxing options tear, which can be problematic. And, we have a lot of experience implementing the simple, reference-type, approach for types that make a lot of sense to people, but then adding a lightweight, value-type version for people who care about that later. See tuples, as well as records.
I expect the same will hold here. But given the former group is multiple orders of magnitude larger than the latter, we tend to design the language in that order accordingly.
Trust me, we're very interested in the low-overhead space as well. But it will be for more advanced users that can accept the tradeoffs involved.
And, in the meantime, we're designing it in C#15 so that you can always roll the perfect implementation for your use case, and still be thought of as a union by the language.
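The tuples and records precedent mentioned above can be seen directly in today's C#, where the reference-type flavor shipped first and a value-type flavor followed. A small sketch, assuming C# 10 or later; the `PointClass` and `PointStruct` names are made up.

```csharp
using System;

// Tuples: System.Tuple is a class; the later (a, b) syntax is a struct.
Tuple<int, string> refTuple = Tuple.Create(1, "one");
(int Id, string Name) valueTuple = (1, "one");

// Records: `record class` came first, `record struct` arrived afterwards.
var refPoint = new PointClass(1, 2);
var valPoint = new PointStruct(1, 2);

Console.WriteLine(refTuple.GetType().IsValueType);   // False
Console.WriteLine(valueTuple.GetType().IsValueType); // True
Console.WriteLine(refPoint.GetType().IsValueType);   // False
Console.WriteLine(valPoint.GetType().IsValueType);   // True

record class PointClass(int X, int Y);
readonly record struct PointStruct(int X, int Y);
```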