API Reference

This part of the documentation covers all the interfaces of the twixl.collections.twitter module.

Main interface

All of twixl’s functionality can be accessed through the following functions and classes.

twixl.collections.twitter.search(query, start_time=None, end_time=None, max_results=None, api=None, callback=None)[source]

Sends a TwiXL search query.

Parameters:
  • query (Query) – A TwiXL query object that describes the search query.

  • start_time (Optional[datetime]) – The oldest UTC timestamp from which the Tweets will be provided.

  • end_time (Optional[datetime]) – The newest UTC timestamp from which the Tweets will be provided.

  • max_results (Optional[int]) – The maximum number of search results in the query output. By default, all search results are stored in the query output.

  • api (Optional) – An initialized twixl Api object. If not provided, an object will be initialized.

  • callback (Optional) – Callback function taking a QueryStatus object.

Returns:

SearchResults object

Return type:

twitter.SearchResults

Usage:

>>> import datetime
>>> from twixl.collections import twitter
>>> query = (
...     twitter.Query()
...     .keywords(keywords=['twitter', 'tweet'])
...     .url(url_regex=["twitter.com/twitter*"])
... )
>>> search_results = twitter.search(
...     query=query,
...     start_time=datetime.datetime(2022, 1, 1, 0, 0),
...     end_time=datetime.datetime(2022, 1, 1, 23, 59, 59),
...     max_results=100,
...     callback=twitter.print_callback
... )
Query status: RUNNING (0 Bytes scanned)
Query status: DOWNLOADING_RESULTS (1 GB scanned)
>>> search_results
SearchResults(...)
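
The start_time and end_time arguments are described as UTC timestamps, and the usage example passes naive datetime objects. The following is a minimal sketch of preparing such a window with the standard library, assuming naive datetimes are interpreted as UTC (this reference does not state that explicitly). The helper names to_naive_utc and utc_day_window are hypothetical and not part of twixl.

```python
import datetime
from typing import Tuple


def to_naive_utc(moment: datetime.datetime) -> datetime.datetime:
    """Convert a timezone-aware datetime to a naive datetime expressed in UTC."""
    if moment.tzinfo is None:
        # Assumption: a naive datetime is already meant as UTC.
        return moment
    return moment.astimezone(datetime.timezone.utc).replace(tzinfo=None)


def utc_day_window(day: datetime.date) -> Tuple[datetime.datetime, datetime.datetime]:
    """Return the first and last whole second of a calendar day as naive datetimes."""
    start = datetime.datetime.combine(day, datetime.time(0, 0, 0))
    end = datetime.datetime.combine(day, datetime.time(23, 59, 59))
    return start, end


start_time, end_time = utc_day_window(datetime.date(2022, 1, 1))

# An aware timestamp at UTC+01:00, converted to naive UTC.
utc_plus_one = datetime.timezone(datetime.timedelta(hours=1))
local = datetime.datetime(2022, 1, 1, 1, 0, tzinfo=utc_plus_one)
converted = to_naive_utc(local)
```

The resulting values can then be passed as start_time and end_time to twitter.search().
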
twixl.collections.twitter.tweet_metrics(api=None)[source]

Sends a get metrics request to the TwiXL Api and returns the results.

Parameters:

api (Optional) – An initialized twixl Api object. If not provided, an object will be initialized.

Returns:

TweetMetrics object

Return type:

twitter.TweetMetrics

Usage:

>>> from twixl.collections import twitter
>>> metrics = twitter.tweet_metrics()
>>> metrics.to_pandas()
enum twixl.collections.twitter.dataset(value)[source]

Twitter datasets available on twi-xl.

Valid values are as follows:

ALL = <dataset.ALL: '/tweets/search/all'>
TWINL = <dataset.TWINL: '/tweets/search/twinl'>
POLITICS = <dataset.POLITICS: '/tweets/search/politics'>
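
Each member listed above is shown together with the endpoint path it stands for. The sketch below uses a local stand-in enum (not the real twixl class) that mirrors the documented members, to illustrate lookup by attribute, by name, and by value. Whether twixl's dataset is a plain enum.Enum is an assumption.

```python
from enum import Enum


class dataset(Enum):
    """Local stand-in mirroring the documented members; not the twixl class."""
    ALL = '/tweets/search/all'
    TWINL = '/tweets/search/twinl'
    POLITICS = '/tweets/search/politics'


by_attribute = dataset.POLITICS            # attribute access
by_name = dataset['TWINL']                 # lookup by member name
by_value = dataset('/tweets/search/all')   # lookup by value
paths = [member.value for member in dataset]
```

With the real class, a member would be passed to the constructor as Query(dataset=dataset.POLITICS), following the Query signature below.
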
class twixl.collections.twitter.Query(dataset=dataset.ALL)[source]

A TwiXL query object to define a search query.

from_userids(userids, overwrite=False)[source]

Add a query statement that matches any tweet from the specific userid(s) to the TwiXL.Query object.

Parameters:

userids (List[str]) – A list of numeric user IDs.

Return type:

Query

Returns:

Query object

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().from_userids(userids=['2244994945', '6253282'])
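
Because from_userids expects numeric user IDs given as strings, it can help to validate the input before building the query. The following is a small sketch using only the standard library. The helper name validate_userids is hypothetical and not part of twixl.

```python
from typing import Iterable, List


def validate_userids(userids: Iterable[str]) -> List[str]:
    """Return the IDs as a list, raising ValueError if any is not an ASCII digit string."""
    checked = []
    for userid in userids:
        if not (isinstance(userid, str) and userid.isascii() and userid.isdigit()):
            raise ValueError(f"not a numeric user ID: {userid!r}")
        checked.append(userid)
    return checked


valid = validate_userids(['2244994945', '6253282'])

# A username instead of a numeric ID is rejected.
try:
    validate_userids(['twitterdev'])
except ValueError as error:
    rejected = str(error)
```
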
from_usernames(usernames, overwrite=False)[source]

Add a query statement that matches any tweet from the specific username(s) to the TwiXL.Query object.

Parameters:

usernames (List[str]) – A list of usernames (excluding the @ character)

Return type:

Query

Returns:

Query object

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().from_usernames(usernames=['twitterdev', 'twitterapi'])
keywords(keywords)[source]

Add a query statement that matches any tweet with the specified keyword(s).

Parameters:

keywords (List[str]) – A list of words

Return type:

Query

Returns:

Query object

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().keywords(keywords=['twitter', 'tweet'])
print()[source]

Print query.

Return type:

None

regex(regex)[source]

Add a query statement that matches any tweet that matches any of the specified regular expression(s).

Parameters:

regex (List[str]) – A list of regular expressions

Return type:

Query

Returns:

Query object

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().regex(regex=[r'\btwit\w+'])
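
To preview what a pattern such as the one above would match, it can be tried locally with Python's re module. This is only an approximation: the regex dialect used by the TwiXL backend is not documented here and may differ from Python's. The pattern is written as a raw string because, in a regular string literal, '\b' is a backspace character rather than a word boundary.

```python
import re

pattern = r'\btwit\w+'  # raw string: \b stays a word boundary
text = 'I love twitter and twits, but not a retwit'

# Words that start with "twit" and have at least one more word character.
matches = re.findall(pattern, text)

# Without the r prefix, '\b' is the backspace character (0x08).
backspace_not_boundary = '\b' == '\x08'
```
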
to_dict()[source]

Returns the TwiXL query as a dict.

Returns:

Query object as dictionary.

Return type:

dict

url(url_regex, must_not=False)[source]

Add a query statement that matches any tweet that contains one of the specified URLs to the TwiXL.Query object.

Parameters:
  • url_regex – A list of URLs or regular expressions.

  • must_not (bool) – If set to True, tweets containing any of the listed URLs are excluded from the final results.

Returns:

Query object

Return type:

twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().url(url_regex=["twitter.com/twitter*"])
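
The parameter name url_regex suggests that the entries are treated as regular expressions, although the matching engine is not specified in this reference. Under that assumption, a pattern like the one above behaves differently from a shell-style wildcard: the trailing * applies only to the preceding r, and the unescaped . matches any character. A quick local check with Python's re module:

```python
import re

pattern = "twitter.com/twitter*"

# '*' repeats the preceding 'r' zero or more times, so 'twitte' already matches.
matches_short = re.search(pattern, "https://twitter.com/twitte") is not None

# '.' matches any character, so 'twitterXcom' matches as well.
matches_other_host = re.search(pattern, "https://twitterXcom/twitter") is not None

# Escaping the literal part makes '.' match only a real dot.
stricter = re.escape("twitter.com/twitter")
strict_rejects_other_host = re.search(stricter, "https://twitterXcom/twitter") is None
strict_accepts_profile = re.search(stricter, "https://twitter.com/twitterdev") is not None
```
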
twixl.collections.twitter.API(api_endpoint, api_key)[source]

Plotting functions

This section of the documentation covers all the plotting functions.

twixl.collections.twitter.plotting.plot_tweet_frequencies(tweets, num_xticks=5, title='Number of tweets per day')[source]

Plot the TwiXL Query result in a frequency plot.

Parameters:
  • tweets – Twitter search results.

  • num_xticks (int) – The number of tick marks on the x-axis.

  • title (Optional) – The figure title.

Return type:

Tuple[Figure, Axes]

Returns:

Tweet frequency figure.

Usage:

>>> from twixl.collections import twitter
>>> twitter.plotting.plot_tweet_frequencies(
>>>   tweets,
>>>   title="Number of 'Elfstedentocht' tweets per day"
>>> )
(<Figure>, <AxesSubplot>)
twixl.collections.twitter.plotting.plot_word_cloud(frequencies, width=800, height=400, max_words=200, stopwords=None, background_color='white', min_word_length=0)[source]

Plots the word-frequency list as a wordcloud.

Parameters:
  • frequencies (Series) – Item frequencies as generated by any of the x_frequencies() methods.

  • width (int) – Width of the canvas.

  • height (int) – Height of the canvas.

  • max_words (int) – The maximum number of words in the wordcloud.

  • stopwords (Optional[List[str]]) – A list of stopwords that should be filtered from the wordcloud.

  • background_color (str) – Background color for the word cloud image.

  • min_word_length (int) – Minimum number of letters a word must have to be included.

Return type:

Figure

Returns:

Word cloud plot

Usage:

>>> from twixl.collections import twitter
>>> twitter.plotting.plot_word_cloud(
>>>   frequencies,
>>>   stopwords=stopwords,
>>>   max_words=100,
>>>   min_word_length=4
>>> )
<matplotlib.image.AxesImage>

Exceptions

This section of the documentation covers all the exceptions.

exception twixl.collections.twitter.exceptions.QueryFailed(message='query has failed')[source]

The query failed.

exception twixl.collections.twitter.exceptions.QueryCanceled(message='query was canceled')[source]

The query was canceled.

exception twixl.collections.twitter.exceptions.QueryTimeout(timeout)[source]

The query timed out.
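
A common way to handle these exceptions is to retry on a timeout while letting a failed or canceled query propagate. The sketch below is self-contained: it defines local stand-in exception classes with the documented constructor signatures (their real base classes are not stated here, so Exception is assumed), and run_with_retry and flaky_query are hypothetical helpers, not part of twixl.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


# Local stand-ins for twixl.collections.twitter.exceptions; assumed to
# derive from Exception.
class QueryFailed(Exception):
    def __init__(self, message='query has failed'):
        super().__init__(message)


class QueryCanceled(Exception):
    def __init__(self, message='query was canceled'):
        super().__init__(message)


class QueryTimeout(Exception):
    def __init__(self, timeout):
        super().__init__(timeout)
        self.timeout = timeout


def run_with_retry(run_query: Callable[[], T], max_attempts: int = 3) -> T:
    """Call run_query, retrying only when it raises QueryTimeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except QueryTimeout:
            if attempt == max_attempts:
                raise  # out of attempts: surface the timeout
        # QueryFailed and QueryCanceled are not caught and propagate at once.
    raise ValueError("max_attempts must be at least 1")


# Fake query that times out twice and then succeeds.
calls = {"count": 0}


def flaky_query() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise QueryTimeout(timeout=60)
    return "results"


outcome = run_with_retry(flaky_query)
```

With the real library, run_query could be a lambda wrapping twitter.search(...), and the exception classes would be imported from twixl.collections.twitter.exceptions.
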

Lower-Level Classes

class twixl.collections.twitter.SearchResults(result_urls)[source]

Manages a collection of ResultPartition objects, providing methods to save and download all results.

result_partitions

A list of ResultPartition objects initialized from the provided URLs.

Type:

List[ResultPartition]

download()[source]

Downloads each result file and yields it as a DataFrame.

Yields:

A DataFrame for each downloaded file.

Return type:

Generator[pd.DataFrame, None, None]

download_all()[source]

Downloads all result files and concatenates them into a single DataFrame.

Returns:

A concatenated DataFrame containing all downloaded results.

Return type:

pd.DataFrame

save_as(target_dir)[source]

Saves all result files to the specified target directory.

Parameters:

target_dir (Union[str, Path]) – The directory to save all files to.

Return type:

None

class twixl.collections.twitter.TweetMetrics(metrics)[source]

Manages the metrics of the twitter collection.

to_pandas()[source]

Returns the metrics as a pandas Series.

Returns:

A pandas Series containing all metrics of the twitter collection.

Return type:

pd.Series