Developer Guides
Step-by-step tutorials for working with TradeXil datasets
Understanding Dataset Formats
Parquet Format
Recommended
Advantages
- 10-100x smaller file size than JSON - Efficient columnar compression
- Much faster to read - Only load columns you need
- Preserves data types - No type conversion needed
- Better for large datasets - Handles millions of rows efficiently
- Industry standard - Used by Pandas, Spark, Arrow
Example
btcusdt_1d_2020-2024.parquet
- ~15 MB for 1,800 candles with 200+ features
JSON Format
Alternative
Advantages
- Human-readable - Easy to inspect in text editors
- Universal support - Works in any language
- Simple structure - No special libraries needed
- Web-friendly - Native JavaScript support
Disadvantages
- Much larger file sizes (~150 MB for the same dataset)
- Slower to parse for large datasets
- Standard parsers must load the entire file into memory
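The trade-offs above can be seen with nothing but the Python standard library. This is a sketch under assumptions: the record layout (a list of objects with timestamp, close, and volume keys) is hypothetical, and the real TradeXil JSON structure may differ.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

# Hypothetical record layout; the real TradeXil JSON structure may differ.
records = [
    {"timestamp": "2020-01-01T00:00:00", "close": 7200.2, "volume": 1200},
    {"timestamp": "2020-01-02T00:00:00", "close": 6985.5, "volume": 1350},
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample_1d.json"
    path.write_text(json.dumps(records), encoding="utf-8")

    # json.load parses the whole document into Python objects in one go.
    with path.open(encoding="utf-8") as fh:
        loaded = json.load(fh)

# Timestamps come back as plain strings and must be converted by hand.
first_ts = datetime.fromisoformat(loaded[0]["timestamp"])
print(type(loaded[0]["timestamp"]).__name__, first_ts.year)
```

No extra libraries are needed, which is the main appeal of JSON, but every row is materialized in memory and date handling is left to the caller.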
Which Format Should I Use?
Use Parquet if:
- You're working with Python/Pandas, Spark, or other data science tools
- You need to analyze large datasets efficiently
- You want to minimize storage and download time
- You're building production trading systems
Use JSON if:
- You need to inspect data manually in a text editor
- You're using a language without Parquet support
- You're working with small datasets (< 1,000 rows)
- You need web browser compatibility
Available Guides
Click on a guide below to view detailed step-by-step instructions.
Getting Started with TradeXil Datasets
Complete setup guide for Windows and Linux users. Install Python, set up your environment, and run your first analysis.
Converting Between Formats
Learn how to convert datasets between JSON and Parquet formats using Python with OS-specific commands.
Viewing Parquet File Data
Multiple ways to inspect and explore Parquet files without converting them. Includes Windows and Linux terminal commands.