ML Data Pipelines

data-engineering

Pandas, NumPy, Dask, RAPIDS AI, Apache Arrow, Apache Parquet

Custom libraries for financial market data processing and storage

High-performance data pipelines for acquiring, preparing, and storing financial market data in columnar form. The pipelines process more than 20 years of trading and order book data and engineer features such as technical indicators and statistical markers, using Apache Arrow as the in-memory format and Apache Parquet as the on-disk format.