API Reference

This part of the documentation covers all the interfaces of twixl.collections.twitter.

Main interface

All of twixl's functionality can be accessed through the following functions and classes.

twixl.collections.twitter.search(query, start_time=None, end_time=None, max_results=None, api=None, callback=None)[source]

Sends a TwiXL search query.

Parameters:

query (Query) – A TwiXL query object that describes the search query.
start_time (Optional[datetime]) – The oldest UTC timestamp from which Tweets will be provided.
end_time (Optional[datetime]) – The newest UTC timestamp up to which Tweets will be provided.
max_results (Optional) – The maximum number of search results in the query output. By default, all search results are stored in the query output.
api (Optional) – An initialized twixl Api object. If not provided, an object will be initialized.
callback (Optional) – Callback function taking a QueryStatus object.

Returns:

SearchResults object

Return type:

twitter.SearchResults

Usage:

>>> import datetime
>>> from twixl.collections import twitter
>>> query = (
...   twitter.Query()
...       .keywords(keywords=['twitter', 'tweet'])
...       .url(url_regex=["twitter.com/twitter*"])
... )
>>> search_results = twitter.search(
...   query=query,
...   start_time=datetime.datetime(2022, 1, 1, 0, 0),
...   end_time=datetime.datetime(2022, 1, 1, 23, 59, 59),
...   max_results=100,
...   callback=twitter.print_callback
... )
Query status: RUNNING (0 Bytes scanned)
Query status: DOWNLOADING_RESULTS (1 GB scanned)
>>> search_results
SearchResults(...)
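
The callback parameter is described only as a function taking a QueryStatus object. The following is a minimal sketch of a custom callback that records each status update instead of printing it. It relies only on str() of the status, so it assumes nothing about the attributes of QueryStatus. FakeStatus and make_recording_callback are hypothetical names used for illustration and are not part of twixl.

```python
from typing import Any, Callable, List


def make_recording_callback(log: List[str]) -> Callable[[Any], None]:
    """Return a callback that appends a printable form of each status to `log`."""

    def callback(status: Any) -> None:
        # Only str() is used, so no QueryStatus attributes are assumed.
        log.append(str(status))

    return callback


class FakeStatus:
    """Hypothetical stand-in for QueryStatus, used only to exercise the callback."""

    def __init__(self, state: str) -> None:
        self.state = state

    def __str__(self) -> str:
        return f"Query status: {self.state}"


status_log: List[str] = []
on_status = make_recording_callback(status_log)

# Simulate the two status updates shown in the usage example.
for state in ("RUNNING", "DOWNLOADING_RESULTS"):
    on_status(FakeStatus(state))

print(status_log)
```

With the real library, a function like on_status would presumably be passed as callback=on_status in the call to search.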

twixl.collections.twitter.tweet_metrics(api=None)[source]

Sends a get metrics request to the TwiXL Api and returns the results.

Parameters:: api (Optional) – An initialized twixl Api object. If not provided, an object will be initialized.
Returns:: TweetMetrics object
Return type:: twitter.TweetMetrics

Usage:

>>> from twixl.collections import twitter
>>> metrics = twitter.tweet_metrics()
>>> metrics.to_pandas()

enum twixl.collections.twitter.dataset(value)[source]

Twitter datasets available on twi-xl.

Valid values are as follows:

ALL = <dataset.ALL: '/tweets/search/all'>

TWINL = <dataset.TWINL: '/tweets/search/twinl'>

POLITICS = <dataset.POLITICS: '/tweets/search/politics'>
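
The member representations above suggest that dataset behaves like a standard Python enum whose values are endpoint paths. The sketch below mirrors the documented values in a local stand-in class (Dataset is a hypothetical name, not the library's own class) to show lookup by value and by name, assuming standard enum.Enum semantics.

```python
from enum import Enum


class Dataset(Enum):
    """Local stand-in mirroring the documented values of twitter.dataset."""

    ALL = "/tweets/search/all"
    TWINL = "/tweets/search/twinl"
    POLITICS = "/tweets/search/politics"


# Lookup by value, matching the documented dataset(value) signature.
by_value = Dataset("/tweets/search/twinl")

# Lookup by member name.
by_name = Dataset["POLITICS"]

print(by_value, by_name.value)
```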

class twixl.collections.twitter.Query(dataset=dataset.ALL)[source]

A TwiXL query object to define a search query.

from_userids(userids, overwrite=False)[source]

Add a query statement that matches any tweet from the specific userid(s) to the TwiXL.Query object.

Parameters:: userids (List[str]) – A list of numeric user IDs
Returns:: Query object
Return type:: twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().from_userids(userids=['2244994945', '6253282'])

from_usernames(usernames, overwrite=False)[source]

Add a query statement that matches any tweet from the specific username(s) to the TwiXL.Query object.

Parameters:: usernames (List[str]) – A list of usernames (excluding the @ character)
Returns:: Query object
Return type:: twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().from_usernames(usernames=['twitterdev', 'twitterapi'])
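
Because usernames must be given without the @ character, user-supplied handles may need cleaning first. This is a small sketch of such a step. normalize_usernames is a hypothetical helper, not part of twixl.

```python
from typing import Iterable, List


def normalize_usernames(raw: Iterable[str]) -> List[str]:
    """Strip whitespace and a leading '@', drop empties and duplicates, keep order."""
    cleaned: List[str] = []
    for name in raw:
        name = name.strip().lstrip("@")
        if name and name not in cleaned:
            cleaned.append(name)
    return cleaned


usernames = normalize_usernames(["@twitterdev", " twitterapi ", "twitterdev", ""])
print(usernames)
```

The resulting list could then be passed as the usernames argument of from_usernames.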

keywords(keywords)[source]

Add a query statement that matches any tweet with the specified keyword(s).

Parameters:: keywords (List[str]) – A list of words
Returns:: Query object
Return type:: twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().keywords(keywords=['twitter', 'tweet'])

print()[source]

Print query.

Return type:: None

regex(regex)[source]

Add a query statement that matches any tweet matching at least one of the specified regular expressions.

Parameters:: regex (List[str]) – A list of regular expressions
Returns:: Query object
Return type:: twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().regex(regex=[r'\btwit\w+'])
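
In an ordinary Python string literal, \b is a backspace character rather than a regex word boundary, so patterns like the one above are usually written as raw strings. The sketch below uses Python's re module to show the difference and to preview what the pattern matches. Whether the TwiXL backend applies exactly the same regex semantics as Python's re is an assumption.

```python
import re

# A non-raw "\b" is a single backspace character; a raw r"\b" is two characters.
backspace = "\b"
word_boundary = r"\b"

# Raw-string version of the documented pattern.
pattern = re.compile(r"\btwit\w+")

text = "Follow twitter and twitch, but not twit."
matches = pattern.findall(text)

print(len(backspace), len(word_boundary), matches)
```

The bare word "twit" is not matched because \w+ requires at least one further word character after "twit".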

to_dict()[source]

Returns the TwiXL query as a dict.

Returns:: Query object as dictionary.
Return type:: dict

url(url_regex, must_not=False)[source]

Add a query statement that matches any tweet that contains one of the specified URLs to the TwiXL.Query object.

Parameters:

url_regex – A list of URLs or regular expressions
must_not (bool) – If set to True, tweets containing any of the listed URLs are excluded from the final results

Returns:

Query object

Return type:

twitter.Query

Usage:

>>> from twixl.collections.twitter import Query
>>> Query().url(url_regex=["twitter.com/twitter*"])
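
Since url_regex accepts regular expressions, characters such as . and * carry special meaning. The sketch below uses Python's re module to show how the example pattern behaves under standard regex rules and how re.escape produces a literal match instead. It assumes the backend interprets patterns the way Python's re does, which the source does not confirm.

```python
import re

pattern = "twitter.com/twitter*"

# Under regex rules, "." matches any character and "r*" means zero or more "r",
# so this unintended URL also matches.
loose = re.search(pattern, "https://twitterXcom/twitte") is not None

# Escaping makes every character literal.
literal = re.escape("twitter.com/twitter")
strict = re.search(literal, "https://twitterXcom/twitte") is not None
exact = re.search(literal, "https://twitter.com/twitterdev") is not None

print(loose, strict, exact)
```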

twixl.collections.twitter.API(api_endpoint, api_key)[source]

Plotting functions

This section of the documentation covers all the plotting functions.

twixl.collections.twitter.plotting.plot_tweet_frequencies(tweets, num_xticks=5, title='Number of tweets per day')[source]

Plot the TwiXL Query result in a frequency plot.

Parameters:

tweets – Twitter search results.
num_xticks (int) – The number of ticks shown on the x-axis.
title (Optional) – The figure title.

Return type:

Tuple[Figure, Axes]

Returns:

Tweet frequency figure.

Usage:

>>> from twixl.collections import twitter
>>> twitter.plotting.plot_tweet_frequencies(
...   tweets,
...   title="Number of 'Elfstedentocht' tweets per day"
... )
(<Figure>, <AxesSubplot>)

twixl.collections.twitter.plotting.plot_word_cloud(frequencies, width=800, height=400, max_words=200, stopwords=None, background_color='white', min_word_length=0)[source]

Plots the word-frequency list as a wordcloud.

Parameters:

frequencies (Series) – item frequencies as generated by any of the x_frequencies() methods.
width (int) – Width of the canvas.
height (int) – Height of the canvas.
max_words (int) – The maximum number of words in the wordcloud.
stopwords (Optional[List[str]]) – A list of stopwords that should be filtered from the wordcloud.
background_color (str) – Background color for the word cloud image.
min_word_length (int) – Minimum number of letters a word must have to be included.

Return type:

Figure

Returns:

Word cloud plot

Usage:

>>> from twixl.collections import twitter
>>> twitter.plotting.plot_word_cloud(
...   frequencies,
...   stopwords=stopwords,
...   max_words=100,
...   min_word_length=4
... )
<matplotlib.image.AxesImage>
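
To make the effect of stopwords, min_word_length and max_words concrete, here is a sketch of how such filtering could work on a plain word-to-count mapping. It illustrates the parameter semantics described above and is not the library's implementation. filter_frequencies and the sample counts are made up for illustration.

```python
from collections import Counter
from typing import List, Mapping, Optional, Tuple


def filter_frequencies(
    frequencies: Mapping[str, int],
    stopwords: Optional[List[str]] = None,
    min_word_length: int = 0,
    max_words: int = 200,
) -> List[Tuple[str, int]]:
    """Drop stopwords and short words, then keep the `max_words` most frequent."""
    stop = {word.lower() for word in (stopwords or [])}
    kept = Counter(
        {
            word: count
            for word, count in frequencies.items()
            if word.lower() not in stop and len(word) >= min_word_length
        }
    )
    return kept.most_common(max_words)


# Made-up sample counts.
sample = {"the": 50, "ijs": 7, "schaatsen": 12, "elfstedentocht": 30}
top_words = filter_frequencies(sample, stopwords=["the"], min_word_length=4, max_words=2)
print(top_words)
```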

Exceptions

This section of the documentation covers all the exceptions.

exception twixl.collections.twitter.exceptions.QueryFailed(message='query has failed')[source]: The query failed.

exception twixl.collections.twitter.exceptions.QueryCanceled(message='query was canceled')[source]: The query was canceled.

exception twixl.collections.twitter.exceptions.QueryTimeout(timeout)[source]: The query timed out.
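
One way to use these exceptions is to retry on a timeout while letting failures and cancellations propagate. The sketch below defines local stand-in exception classes with the documented names and signatures so that it runs without twixl. run_with_retry, the flaky query function and the timeout message text are hypothetical.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


# Local stand-ins mirroring the documented exception signatures.
class QueryFailed(Exception):
    def __init__(self, message: str = "query has failed") -> None:
        super().__init__(message)


class QueryCanceled(Exception):
    def __init__(self, message: str = "query was canceled") -> None:
        super().__init__(message)


class QueryTimeout(Exception):
    def __init__(self, timeout: float) -> None:
        # The message text here is made up for the sketch.
        super().__init__(f"query timed out after {timeout}")
        self.timeout = timeout


def run_with_retry(run_query: Callable[[], T], max_attempts: int = 3) -> T:
    """Retry on QueryTimeout only; QueryFailed and QueryCanceled propagate."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except QueryTimeout:
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")


calls = 0


def flaky_query() -> str:
    """Hypothetical query that times out twice before succeeding."""
    global calls
    calls += 1
    if calls < 3:
        raise QueryTimeout(timeout=30)
    return "ok"


result = run_with_retry(flaky_query, max_attempts=3)
print(result, calls)
```

With the real library, the stand-in classes would be replaced by imports from twixl.collections.twitter.exceptions.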

Lower-Level Classes

class twixl.collections.twitter.SearchResults(result_urls)[source]

Manages a collection of ResultPartition objects, providing methods to save and download all results.

result_partitions

A list of ResultPartition objects initialized from the provided URLs.

Type:: List[ResultPartition]

download()[source]

Downloads each result file and yields it as a DataFrame.

Yield:: A DataFrame for each downloaded file.
Return type:: Generator[pd.DataFrame, None, None]

download_all()[source]

Downloads all result files and concatenates them into a single DataFrame.

Returns:: A concatenated DataFrame containing all downloaded results.
Return type:: pd.DataFrame
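
The difference between download() and download_all() is that the former yields one partition at a time while the latter holds everything in memory at once. The sketch below illustrates that trade-off with a stand-in class that uses plain lists of rows in place of DataFrames. FakeSearchResults and its sample data are hypothetical and only mimic the documented method names.

```python
from typing import Dict, Iterator, List

Row = Dict[str, object]


class FakeSearchResults:
    """Hypothetical stand-in for SearchResults; partitions are lists of rows."""

    def __init__(self, partitions: List[List[Row]]) -> None:
        self._partitions = partitions

    def download(self) -> Iterator[List[Row]]:
        # Yield one partition at a time, so only one is held by the caller.
        for partition in self._partitions:
            yield partition

    def download_all(self) -> List[Row]:
        # Concatenate every partition into a single in-memory collection.
        return [row for partition in self.download() for row in partition]


results = FakeSearchResults(
    [
        [{"id": 1}, {"id": 2}],
        [{"id": 3}],
    ]
)

# Streaming: process each partition separately.
rows_per_partition = [len(partition) for partition in results.download()]

# Eager: load everything at once.
all_rows = results.download_all()

print(rows_per_partition, len(all_rows))
```

For large result sets, iterating over download() and aggregating per partition is likely the safer choice.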

save_as(target_dir)[source]

Saves all result files to the specified target directory.

Parameters:: target_dir (Union[str, Path]) – The directory to save all files to.
Return type:: None

class twixl.collections.twitter.TweetMetrics(metrics)[source]

Manages the metrics of the twitter collection.

to_pandas()[source]

Return metrics as a pandas series.

Returns:: A pandas series containing all metrics of the twitter collection.
Return type:: pd.Series