###################################
Architecture Overview
###################################

The primary components of the Twi-XL API architecture are depicted below:

.. figure:: ./img/architecture_overview.png
   :align: center
   :scale: 100

   Figure - The Twi-XL Architecture

twi-xl-python
------------------------

The `twi-xl-python` package is a Python library for interfacing with the Twi-XL API and downloading the query results.

Twi-XL API
------------------------

The Twi-XL API is the interface to the Twi-XL functionality. It is responsible for translating incoming requests from the `twi-xl-python`_ library into search tasks and returning the results.

Athena
------------------------

Athena is the interactive query service that is used to analyze the `TwiNL` Twitter archive.

TwiNL archive
------------------------

The `TwiNL` Twitter archive is stored in an Amazon S3 bucket. The Twitter messages are aggregated, partitioned and compressed to reduce the total size and improve the search performance.

Athena results
------------------------

The results of the Athena queries are stored in an Amazon S3 bucket. These results are automatically downloaded by the `twi-xl-python`_ library.

Twitter scraper
------------------------

The Twi-XL scraper is responsible for collecting new Dutch tweets and storing them in the `Raw tweets`_ bucket.

Raw tweets
------------------------

The raw tweets are stored in an Amazon S3 bucket.

Step Function workflow
------------------------

The Step Function workflow collects and compresses the latest scraped tweets and adds them to the `TwiNL archive`_. This workflow runs every night.
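
To illustrate what "aggregated, partitioned and compressed" could mean in practice, the following is a minimal sketch of a collect-and-compress step. It is not the actual workflow code: the function name ``partition_and_compress``, the ``created_at`` field, the per-day partitioning, and the ``date=YYYY-MM-DD/tweets.jsonl.gz`` layout are all assumptions made for the example, and it writes to a local directory rather than to S3.

```python
import gzip
import json
from collections import defaultdict
from pathlib import Path


def partition_and_compress(tweets, out_dir):
    """Group tweets by calendar day and write one gzipped JSON Lines file per day.

    Hypothetical helper: the real archive layout is not described in this document.
    """
    by_day = defaultdict(list)
    for tweet in tweets:
        # Assumption: each tweet has an ISO 8601 "created_at" timestamp,
        # so the first 10 characters are "YYYY-MM-DD".
        day = tweet["created_at"][:10]
        by_day[day].append(tweet)

    written = []
    for day, items in sorted(by_day.items()):
        # Assumed "key=value" directory naming, one directory per partition.
        part_dir = Path(out_dir) / f"date={day}"
        part_dir.mkdir(parents=True, exist_ok=True)
        path = part_dir / "tweets.jsonl.gz"
        # Text-mode gzip: one JSON document per line.
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            for item in items:
                fh.write(json.dumps(item, ensure_ascii=False) + "\n")
        written.append(path)
    return written
```

Partitioning by date lets a query that filters on a date range skip unrelated files, and compression reduces the amount of data stored and scanned, which matches the goals stated for the `TwiNL archive`_.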