A real-world journey of how my small team at Xena Intelligence built robust data pipelines for our enterprise customers using Airflow. If you’re a data engineer, or part of a small team, this talk is for you. Learn how we orchestrated a complex workflow to process millions of public reviews.
What You’ll Learn:
-
Cost-Efficient DAG Designing: Decomposing complex processes into atomic tasks using the TaskFlow, XComs, Mapped tasks, and Task groups. Diving into one of our DAGs as a concrete example of how our approach optimizes parallelism, error handling, delivery speed, and reliability.
-
Integrating LLM Analysis: Explore how we integrated LLM-based analysis into our pipeline. Learn how we designed the database, queries, and ingestion to Postgres.
-
Extending Airflow UI: We developed a custom Airflow UI plugin that filters and visualizes DAG runs by customer, product, and marketplace, delivering clear insights for faster troubleshooting.
-
Leveraging Airflow REST API: Discover how we leveraged the API to trigger DAGs on demand, elevating the UX by tracking mapped DAG progress and computing ETAs.
-
CI/CD and Cost Management: Get practical tips for deploying DAGs with CI/CD.
Naseem Shah
Engineering at Xena Intelligence