What is a "data processing pipeline"?
I couldn't find a satisfying description on the web.
I don't understand how the data flows through the pipeline, and I hope someone can clarify what is going on there.
I thought a pipeline of commands processes files (text, arrays of strings) in a line-by-line manner.
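To illustrate my mental model: each stage handles input as it arrives and writes its result immediately, so nothing waits for the upstream command to finish (a minimal sketch using tr and sed, which both behave this way):

```shell
# Each stage reads lines as they arrive and writes results right away;
# no stage waits for the previous one to consume its whole input.
printf 'foo\nbar\n' | tr 'a-z' 'A-Z' | sed 's/$/!/'
# prints:
# FOO!
# BAR!
```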
I have data stored on disk in files that are far too big to store in main memory.
I want to stream this data from the disk into a data processing pipeline via iconv, like this:
zcat myfile | iconv -f L1 -t UTF-8 | # rest of the pipeline goes here
Unfortunately, I'm seeing iconv buffer the entire file in memory, writing nothing until its input is exhausted.
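For reference, here is a small timing harness (a sketch) for checking whether a stage streams or buffers. With cat in the middle, which does stream, the two timestamps differ by roughly the sleep interval; substituting `iconv -f L1 -t UTF-8` for cat is where I observe everything arriving at once, only after EOF:

```shell
# Send one line, pause, send another, then EOF; timestamp each line as it
# emerges downstream. A streaming stage (cat here) emits the first line
# before the sleep finishes, so the timestamps differ; a buffering stage
# emits both lines together after EOF, with identical timestamps.
{ printf 'first\n'; sleep 2; printf 'second\n'; } |
  cat |
  while IFS= read -r line; do
    printf '%s  %s\n' "$(date +%s)" "$line"
  done
```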