In Airflow, a DAG (Directed Acyclic Graph) is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.

For example, a simple DAG could consist of three tasks: A, B, and C. It could say that A has to run successfully before B can run, but C can run anytime. It could say that task A times out after 5 minutes, and B can be restarted up to 5 times if it fails. It might also say that the workflow will run every night at 10pm, but shouldn't start until a certain date.

In this way, a DAG describes how you want to carry out your workflow, but notice that we haven't said anything about what we actually want to do! A, B, and C could be anything. Maybe A prepares data for B to analyze while C sends an email. Or perhaps A monitors your location so B can open your garage door while C turns on your house lights. The important thing is that the DAG isn't concerned with what its constituent tasks do; its job is to make sure that whatever they do happens at the right time, or in the right order, or with the right handling of any unexpected issues.

DAGs are defined in standard Python files that are placed in Airflow's DAG_FOLDER. Airflow will execute the code in each file to dynamically build the DAG objects. You can have as many DAGs as you want, each describing an arbitrary number of tasks. In general, each one should correspond to a single logical workflow.

While DAGs describe how to run a workflow, Operators determine what actually gets done.

An operator describes a single task in a workflow. Operators are usually (but not always) atomic, meaning they can stand on their own and don't need to share resources with any other operators. The DAG will make sure that operators run in the correct order; other than those dependencies, operators generally run independently. In fact, they may run on two completely different machines.

This is a subtle but very important point: in general, if two operators need to share information, like a filename or small amount of data, you should consider combining them into a single operator. If it absolutely can't be avoided, Airflow does have a feature for operator cross-communication called XCom.

Airflow provides operators for many common tasks, including:

PythonOperator - calls an arbitrary Python function.
SimpleHttpOperator - sends an HTTP request.
MySqlOperator, SqliteOperator, PostgresOperator, MsSqlOperator, OracleOperator, JdbcOperator, etc. - execute a SQL command.
Sensor - waits for a certain time, file, database row, S3 key, etc.

In addition to these basic building blocks, there are many more specific operators: DockerOperator, HiveOperator, S3FileTransformOperator, PrestoToMysqlOperator, SlackOperator… you get the idea!
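The split between how a workflow runs (the DAG) and what each task does (the operators) can be illustrated without Airflow at all. The block below is a toy sketch, not Airflow code. It assumes Python 3.9+ for the standard-library `graphlib` module, and the task functions `prepare_data`, `analyze_data`, and `send_email` are hypothetical stand-ins for tasks A, B, and C. Timeouts, retries, and schedules are not modeled here.

```python
from graphlib import TopologicalSorter

# The "what": hypothetical task bodies. The runner below never looks inside them.
def prepare_data():
    return "A: data prepared"

def analyze_data():
    return "B: data analyzed"

def send_email():
    return "C: email sent"

tasks = {"A": prepare_data, "B": analyze_data, "C": send_email}

# The "how": each task maps to the set of tasks that must finish first.
# B depends on A; A and C have no prerequisites, so C can run anytime.
dependencies = {"A": set(), "B": {"A"}, "C": set()}

def run_workflow(tasks, dependencies):
    """Run every task once, in an order that respects the dependency graph."""
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        log.append(tasks[name]())
    return log

log = run_workflow(tasks, dependencies)
```

Swapping `send_email` for any other callable changes what the workflow does without touching `dependencies`, which mirrors the point that the DAG is not concerned with what its tasks do. The only ordering guaranteed is that A runs before B. Where C lands relative to them is up to the sorter.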