secondcoming's comments

legal discovery process?


there's even an episode of The Office where this happens. https://www.youtube.com/watch?v=V3GbCByGltU&t=214s


But won't all those posix functions that take only `const char*` parameters need to be changed to be pointer/length?


No. For string literals, they already have a 0 appended, so no problem. For others, you'll need to malloc/copy/free.

It hasn't been much of an issue with decades of D code.
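A minimal C++ sketch of the malloc/copy/free step described above, for a pointer/length slice that is not NUL-terminated. `to_c_string` and `c_length_of` are hypothetical helper names, and `std::strlen` merely stands in for any function that takes a `const char*`.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string_view>

// Hypothetical helper: copy a pointer/length slice into a freshly malloc'd,
// NUL-terminated buffer. The caller owns the result and must free() it.
char* to_c_string(std::string_view s) {
    char* buf = static_cast<char*>(std::malloc(s.size() + 1));
    if (buf == nullptr) return nullptr;
    if (!s.empty()) std::memcpy(buf, s.data(), s.size());
    buf[s.size()] = '\0';
    return buf;
}

// Hypothetical wrapper around a C API that expects a NUL-terminated string.
// strlen is only a stand-in for open(), getenv(), and friends.
std::size_t c_length_of(std::string_view s) {
    char* tmp = to_c_string(s);           // malloc + copy
    if (tmp == nullptr) return 0;
    const std::size_t n = std::strlen(tmp);
    std::free(tmp);                       // free
    return n;
}
```

A slice such as the first five characters of "hello world" has no terminator of its own, which is why the copy is needed; a string literal passed whole could be handed to the C function directly.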


I had unattended-upgrades cripple our VMs


numpy is a python wrapper over a C library written by people who have ground those gears


Yes but not all of them

It would be easy to push complexity up at the level of Numpy/Pytorch/Tensorflow but it mostly gets hidden

(also a lot of it relies on LAPACK which is Fortran - which kinda works with SIMD better than C/C++)


I'm not sure what the argument is here?

These are in the standard library because someone proposed their inclusion.

They're fine for the majority of people who really don't want to roll their own data structures each time.

They're not compulsory to use, you're still free to roll your own.


> I'm not sure what the argument is here?

That std::hive will fit right in. Another container type you probably shouldn't use, draining precious maintenance resource from groups who have better things they could be doing.

> These are in the standard library because someone proposed their inclusion.

As with std::hive. Indeed the "unordered" containers, just like std::hive, were repeatedly knocked back and eventually got in decades after they were obsolete. Persistence really does pay off in C++.

> They're fine for the majority of people who really don't want to roll their own data structures each time.

Sure, doubtless std::hive is fine for that same majority of people.


In my limited experience with looking at autovectorisation compiler output, gcc is quite bad unless you hold its hand, and clang tries to autovectorise everything it sees.


The problem is more in the language (or in SIMD architecturally, depending on your POV). C semantics block too many of the necessary transformations that autovectorization would need to do.
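One commonly cited example of such a semantic obstacle is pointer aliasing; whether that is what the comment had in mind is an assumption. A rough C++ sketch follows, in which the function names are hypothetical and `__restrict` is a non-standard extension accepted by GCC, Clang and MSVC.

```cpp
#include <cstddef>

// The compiler must assume dst and src may overlap, so a store to dst[i]
// could change a later src[j]. To vectorise it typically has to add a
// runtime overlap check, or fall back to scalar code.
void scale_may_alias(float* dst, const float* src, float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] * k;
}

// __restrict is the programmer's promise that the two ranges do not overlap,
// which removes that obstacle. Breaking the promise is undefined behaviour.
void scale_no_alias(float* __restrict dst, const float* __restrict src,
                    float k, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] * k;
}
```

Both functions compute the same result for non-overlapping arrays; the difference only shows up in the generated code.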


"Claude, don't create any technical debt please"


i've been told that it's totally fine because once the codebase turns into spaghetti you can simply tell the agent to refactor it and then everything will be ok


I know this is a tongue-in-cheek response, but this brings me great pain. The spaghetti begins quickly, and your unit/functional tests won't help you unless you hammered out your module API seams before you even began. Oh, your abstractions are leaking? Your modules know too much about each other? Multiply the spaghetti!


The multiple layers of vibe make the dozens of codebases even harder to maintain.


all your GPUs are belong to us


Is this still true? New versions of protobuf allow codegen of `std::string_view` accessors rather than `const std::string&` (which forces a copy) for `string` and `bytes` fields.

https://protobuf.dev/reference/cpp/string-view/


It allows avoiding allocations, but it doesn't allow using serialised data as the backing memory for an in-language type. Protobuf varints have to be decoded and written out somewhere. They cannot be lazily decoded efficiently either: the order of fields in the serialised message is unspecified, so a decoder either has to iterate over the message again and again to find each field on demand, or build a map of offsets, which negates any wins zero-copy strives to achieve.
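To illustrate why a varint cannot simply be aliased, here is a minimal C++ sketch of base-128 varint decoding (7 payload bits per byte, least significant group first, high bit set on every byte except the last). The `Varint` struct and `decode_varint` name are hypothetical, not part of any protobuf library.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result type: the decoded value and how many bytes it used.
struct Varint {
    std::uint64_t value;
    std::size_t size;
};

// Decode one varint from the start of data. Returns nullopt if the input
// ends before a terminating byte, or if more than 10 bytes would be needed.
std::optional<Varint> decode_varint(const std::uint8_t* data, std::size_t len) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < len && i < 10; ++i) {
        // The low 7 bits are payload, shifted into place group by group.
        result |= static_cast<std::uint64_t>(data[i] & 0x7F) << (7 * i);
        // A clear high bit marks the last byte of the varint.
        if ((data[i] & 0x80) == 0) return Varint{result, i + 1};
    }
    return std::nullopt;
}
```

The bytes 0x96 0x01 on the wire decode to the integer 150, whose in-memory representation looks nothing like those two bytes, so the value has to be materialised somewhere rather than pointed at.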


This is true, but the relative overhead of this is highly dependent on the protobuf structure in one's schema. For example, fixed integer fields don't need to be decoded (including repeated fixed ints), and the main idea of the "zero copy" here is avoiding copying string and bytes fields. If your protobufs are mostly varints then yes, they all have to be decoded. If your protobufs contain a lot of string/bytes data, then most of the decoding overhead could be memory copies for this data rather than varint decoding.

In some message schemas, even though this isn't truly zero copy, it may be close to it in terms of actual overhead and CPU time; in other schemas it doesn't help at all.


The win could be only decoding the fields you actually care about, rather than all fields.

It's the same for any other high performance decoding of TLV formats (FIX in finance for instance).
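A rough C++ sketch of that idea for the protobuf wire format: scan the tag/value records, skip everything that isn't wanted, and return a view of the one length-delimited field of interest. `read_varint` and `find_bytes_field` are hypothetical names, groups (wire types 3 and 4) are not handled, and the first match is returned, which is a simplification.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Read a base-128 varint at pos and advance pos past it.
// Returns nullopt on truncated or over-long input.
std::optional<std::uint64_t> read_varint(std::string_view buf, std::size_t& pos) {
    std::uint64_t result = 0;
    for (int shift = 0; shift < 70 && pos < buf.size(); shift += 7) {
        const auto byte = static_cast<std::uint8_t>(buf[pos++]);
        result |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return result;
    }
    return std::nullopt;
}

// Return a view of the first length-delimited field numbered `wanted`,
// skipping over every other field without materialising it.
std::optional<std::string_view> find_bytes_field(std::string_view msg,
                                                 std::uint32_t wanted) {
    std::size_t pos = 0;
    while (pos < msg.size()) {
        const auto tag = read_varint(msg, pos);
        if (!tag) return std::nullopt;
        const std::uint64_t field = *tag >> 3;   // field number
        switch (*tag & 0x7) {                    // wire type
        case 0:  // varint: must be walked byte by byte to find its end
            if (!read_varint(msg, pos)) return std::nullopt;
            break;
        case 1:  // 64-bit fixed
            if (msg.size() - pos < 8) return std::nullopt;
            pos += 8;
            break;
        case 2: {  // length-delimited: string, bytes, nested message
            const auto len = read_varint(msg, pos);
            if (!len || *len > msg.size() - pos) return std::nullopt;
            const auto n = static_cast<std::size_t>(*len);
            if (field == wanted) return msg.substr(pos, n);
            pos += n;
            break;
        }
        case 5:  // 32-bit fixed
            if (msg.size() - pos < 4) return std::nullopt;
            pos += 4;
            break;
        default:  // groups and unknown wire types: not handled in this sketch
            return std::nullopt;
        }
    }
    return std::nullopt;
}
```

The returned view points into the input buffer, so it is only valid while that buffer is alive. Each lookup is still a linear scan, which is the cost the earlier comment about unspecified field order points at.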


Those field accessors take and return string_view but they still copy. The official C++ library always owns the data internally and never aliases except in one niche use case: the field type is Cord, the input is large and meets some other criteria, and the caller has used kParseWithAliasing, which is undocumented.

To a very close approximation you can say that the official protobuf C++ library always copies and owns strings.


Well that is very disappointing news.

Even the decoder makes a copy even though it's returning a string_view? What's the point then?

I can understand encoders having to make copies, but not in a decoder.


Google really dropped the ball with protobuf when they took so long to make them zero-copy. There are 3rd party implementations popping up now and a real risk of future wire-level incompatibilities across languages.


"zero copy" in this context just means that the contents of the input buffer are aliased to string fields in the decoded representation. This is a language-level feature and has nothing to do with the wire format.

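A small C++ sketch of that distinction. The toy wire format here (one length byte followed by that many bytes) is an assumption for illustration and is not protobuf's; all type and function names are hypothetical. Both decoders read exactly the same bytes and differ only in the in-language type they produce.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct OwnedMessage   { std::string name; };       // owns a copy of the bytes
struct AliasedMessage { std::string_view name; };  // points into the input

// Locate the field payload in the toy format: [length byte][payload bytes].
std::optional<std::string_view> field_bytes(std::string_view wire) {
    if (wire.empty()) return std::nullopt;
    const std::size_t len = static_cast<unsigned char>(wire[0]);
    if (wire.size() - 1 < len) return std::nullopt;
    return wire.substr(1, len);
}

// Copying decoder: the result stays valid after the input buffer is gone.
std::optional<OwnedMessage> decode_owned(std::string_view wire) {
    const auto f = field_bytes(wire);
    if (!f) return std::nullopt;
    return OwnedMessage{std::string(*f)};
}

// Aliasing ("zero copy") decoder: no allocation and no copy, but the result
// must not outlive the input buffer.
std::optional<AliasedMessage> decode_aliased(std::string_view wire) {
    const auto f = field_bytes(wire);
    if (!f) return std::nullopt;
    return AliasedMessage{*f};
}
```

Nothing about the bytes on the wire changes between the two; the trade-off is purely about lifetime and ownership on the decoding side.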
