Technology Choices
Why Rust?
ysv is built to process tens of gigabytes of data fairly quickly: to prepare data for analysis, or to import it into databases or cloud storage systems. Performance matters, and Rust, with its portability and safety, seems to be a perfect fit for the job.
UNIX spirit
Being a command-line utility, ysv can be trivially integrated with other tools. For example, I use it in scenarios like this:
Here, we download data from a remote location, unarchive it, transform it, and import it into a PostgreSQL database. The dataset may be huge, but we never store it on the local disk: everything happens on the fly.
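To illustrate why a streaming filter never needs the whole dataset on disk or in memory, here is a minimal Rust sketch. It is not ysv's actual code: the function name `stream_lines` and the closure-based `transform` parameter are assumptions made for illustration. It reads one line at a time from any buffered reader and writes the transformed line straight to the output.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical helper (not part of ysv): reads `input` line by line,
/// applies `transform` to each line, and writes the result to `output`
/// immediately. Only one line is held in memory at a time.
/// Returns the number of lines processed.
pub fn stream_lines<R, W, F>(input: R, mut output: W, transform: F) -> io::Result<u64>
where
    R: BufRead,
    W: Write,
    F: Fn(&str) -> String,
{
    let mut count = 0;
    for line in input.lines() {
        // Propagate I/O errors (for example, a closed pipe) to the caller.
        let line = line?;
        writeln!(output, "{}", transform(&line))?;
        count += 1;
    }
    output.flush()?;
    Ok(count)
}
```

In a real command-line tool, one would pass `io::stdin().lock()` and `io::stdout().lock()` as `input` and `output`, so the program can sit in the middle of a pipe between a downloader and a database client.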
Configuration language
The ysv configuration format aims to feel like a very specialized, but purely functional and strongly typed, programming language. Every transformation except I/O is a monad. You may imagine an invisible .map() placed between every two consecutive transformations.
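The "invisible .map()" idea can be sketched in Rust as follows. This is only an illustration under stated assumptions, not ysv's implementation: the `Step` type, the `trim` and `uppercase` steps, and the `apply_steps` function are hypothetical names, and a missing cell is modelled as `None`.

```rust
/// Hypothetical transformation step: a pure function from one cell value to another.
type Step = fn(String) -> String;

/// Example step: strips surrounding whitespace.
fn trim(value: String) -> String {
    value.trim().to_string()
}

/// Example step: converts the value to upper case.
fn uppercase(value: String) -> String {
    value.to_uppercase()
}

/// Applies the steps in order, calling `.map()` once per step.
/// If the cell is `None`, every step is skipped and `None` is returned.
fn apply_steps(cell: Option<String>, steps: &[Step]) -> Option<String> {
    steps.iter().fold(cell, |value, step| value.map(*step))
}
```

With this sketch, `apply_steps(Some("  kyiv ".to_string()), &[trim, uppercase])` runs both steps in sequence, while a `None` input passes through untouched, so no individual step has to handle the missing-value case itself.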
xsv2schema
xsv2schema is a tool written in Python that generates stubs of ysv config files from CSV data. It can save you the tedious work of writing configs, and it provides an example of how easy it is to generate ysv configurations programmatically.
Future
Plans: