API reference

The LIWC module provides a Python interface to the Linguistic Inquiry and Word Count (LIWC) tool, allowing users to perform text analysis.

Initialization


Liwc

 Liwc (liwc_cli_path:str='LIWC-22-cli', threads:Optional[int]=None,
       verbose:bool=False)

Initialize the Liwc class.

Type Default Details
liwc_cli_path str LIWC-22-cli LIWC CLI Path.
threads Optional None Number of threads to use. Defaults to the number of CPU cores minus one.
verbose bool False Print status messages and show a progress bar. Defaults to False.
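As a rough illustration of the threads default described above, the fallback could be computed as in the sketch below. The helper name resolve_threads is hypothetical and not part of the Liwc class, and the library's actual logic may differ.

```python
import os
from typing import Optional


def resolve_threads(threads: Optional[int] = None) -> int:
    """Hypothetical helper: use the explicit value, else CPU cores minus one."""
    if threads is not None:
        return threads
    # os.cpu_count() may return None on some platforms, so fall back to 1.
    cores = os.cpu_count() or 1
    # Never go below one thread, even on a single-core machine.
    return max(1, cores - 1)


print(resolve_threads())   # depends on the machine
print(resolve_threads(4))  # explicit value is kept as-is
```

Passing an explicit threads value is probably the safer choice on shared machines, where leaving only one core free may still be too aggressive.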

LIWC Analysis


Liwc.analyze_string_to_json

 Liwc.analyze_string_to_json (input_string:str, liwc_dict:str='LIWC22')

Analyze a single string and return the result as JSON (a dict).

Type Default Details
input_string str The string to analyze.
liwc_dict str LIWC22 Dictionary to use for analysis. Defaults to “LIWC22”.
Returns dict Analysis results in JSON format.

Liwc.analyze_string

 Liwc.analyze_string (input_string:str, output_location:str,
                      liwc_dict:str='LIWC22')

Analyze a single string using LIWC and save the result to a CSV file.

Type Default Details
input_string str The string to analyze.
output_location str Path to save the analysis output (.csv).
liwc_dict str LIWC22 Dictionary to use for analysis. Defaults to “LIWC22”.
Returns None

Liwc.analyze_df

 Liwc.analyze_df (text:pandas.core.series.Series, return_input:bool=False,
                  liwc_dict:str='LIWC22')

Analyze text data from a pandas Series (for example, one column of a DataFrame) using LIWC.

Type Default Details
text Series Pandas Series containing text data.
return_input bool False Whether to return the input text with the output. Defaults to False.
liwc_dict str LIWC22 Dictionary to use for analysis. Defaults to “LIWC22”.
Returns DataFrame DataFrame containing the analysis results.

Liwc.analyze_folder

 Liwc.analyze_folder (input_folder:str, output_location:str,
                      liwc_dict:str='LIWC22')

Analyze all text files in a folder using LIWC.

Type Default Details
input_folder str Path to the folder containing text files.
output_location str Path to save the analysis output.
liwc_dict str LIWC22 Dictionary to use for analysis. Defaults to “LIWC22”.
Returns None
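To make the expected input of analyze_folder more concrete, here is a minimal sketch that prepares a folder of text files. The helper write_text_files, the .txt extension, and the file names are assumptions made for illustration; the call to analyze_folder is left commented out because it needs a working LIWC-22 installation.

```python
import tempfile
from pathlib import Path


def write_text_files(folder: Path, documents: dict) -> list:
    """Hypothetical helper: write each document to <folder>/<name>.txt."""
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, text in documents.items():
        path = folder / f"{name}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return sorted(paths)


with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp) / "corpus"
    files = write_text_files(corpus, {"doc1": "First text.", "doc2": "Second text."})
    print([p.name for p in files])
    # With LIWC-22 available, the folder could then be analyzed like this:
    # liwc = Liwc('LIWC-22-cli')
    # liwc.analyze_folder(str(corpus), str(Path(tmp) / "results.csv"))
```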

Liwc.analyze_csv

 Liwc.analyze_csv (input_file:str, output_location:str,
                   row_id_indices:str, column_indices:str,
                   liwc_dict:str='LIWC22')

Analyze text data from a CSV file using LIWC.

Type Default Details
input_file str Path to the input CSV file.
output_location str Path to save the analysis output.
row_id_indices str Indices of row IDs in the CSV.
column_indices str Indices of text columns in the CSV.
liwc_dict str LIWC22 Dictionary to use for analysis. Defaults to “LIWC22”.
Returns None
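Because row_id_indices and column_indices are passed to analyze_csv as strings, it can help to derive them from the CSV header instead of counting columns by hand. The sketch below assumes the indices are 1-based and comma-separated; that format is an assumption that should be checked against the LIWC-22 CLI documentation. The helper name, the sample data, and the file names are hypothetical.

```python
import csv
import io


def column_indices(header, names, one_based=True):
    """Hypothetical helper: map column names to a comma-separated index string."""
    offset = 1 if one_based else 0  # assumption: the CLI counts columns from 1
    return ",".join(str(header.index(name) + offset) for name in names)


# Made-up sample data standing in for a real input file.
sample = "id,author,text\n1,ann,Hello there\n2,bob,Good morning\n"
header = next(csv.reader(io.StringIO(sample)))

row_ids = column_indices(header, ["id"])
text_cols = column_indices(header, ["text"])
print(row_ids, text_cols)  # 1 3

# With LIWC-22 available, the strings could be passed on like this:
# liwc.analyze_csv(input_file="posts.csv", output_location="posts_liwc.csv",
#                  row_id_indices=row_ids, column_indices=text_cols)
```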
# liwc = Liwc('LIWC-22-cli.exe', verbose=True)
# s = "As Leclerc entered the Invalides, with his cortege of exaltation in the sun of Africa and the battles of Alsace, enter here, Jean Moulin, with your terrible cortege."
# r = liwc.analyze_string_to_json(s)
# desired_keys = ['WC', 'Analytic', 'Clout', 'Authentic', 'Tone']
# filtered_dict = {key: r[key] for key in desired_keys if key in r}
# print(filtered_dict)

Language Style Matching


Liwc.analyze_lsm

 Liwc.analyze_lsm (df:pandas.core.frame.DataFrame,
                   calculate_lsm:str='person-and-group',
                   group_column:str='GroupID',
                   person_column:str='PersonID', text_column:str='Text',
                   output_type:str='pairwise', expanded_output:bool=False,
                   omit_speakers_number_of_turns:int=0,
                   omit_speakers_word_count:int=10,
                   segmentation:str='none', wsl_mode:bool=True)

Analyzes Language Style Matching (LSM) based on the provided DataFrame.

Type Default Details
df DataFrame Input DataFrame containing the text data to be analyzed.
calculate_lsm str person-and-group Sets the type of LSM calculation. Options are:
- “person”: Calculate only person-level LSM.
- “group”: Calculate only group-level LSM.
- “person-and-group”: Calculate both person and group-level LSM.
Default is “person-and-group”.
group_column str GroupID The column name in df representing the Group ID. Default is ‘GroupID’.
person_column str PersonID The column name in df representing the Person ID. Default is ‘PersonID’.
text_column str Text The column name in df representing the text data. Default is ‘Text’.
output_type str pairwise Sets the type of output. Options are:
- “one-to-many”: One-to-many comparison.
- “pairwise”: Pairwise comparison.
Default is “pairwise”.
expanded_output bool False Adds an option to get an expanded LSM output. Default is False.
omit_speakers_number_of_turns int 0 Omit speakers if the number of turns is less than this value. Default is 0.
omit_speakers_word_count int 10 Omit speakers if the word count is less than this value. Default is 10.
segmentation str none Segmentation options for splitting the text. Options are:
- “none”: No segmentation.
- “not=”: Number of turns per segment.
- “nofst=”: Number of segments by speaker turn.
- “nofwc=”: Number of segments by word count.
- “now=”: Number of words per segment.
- “boc=”: Segmentation based on characters.
- “regexp=”: Segmentation based on a regular expression.
Default is “none”.
wsl_mode bool True Whether to convert paths for WSL. Defaults to True.
Returns Union The resulting LSM analysis. The output format depends on the specified output_type.
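The segmentation values listed above are prefixes that end in "=", which suggests a value is appended to them. A small helper such as the one below could build and validate these strings; it is a sketch only, the function name is hypothetical, and the assumption that the value follows the "=" directly (for example "now=250") should be verified against the LIWC-22 CLI documentation.

```python
# Prefixes taken from the segmentation options in the table above.
SEGMENTATION_PREFIXES = {"not", "nofst", "nofwc", "now", "boc", "regexp"}


def segmentation_option(kind: str = "none", value=None) -> str:
    """Hypothetical helper: build a segmentation string such as 'now=250'."""
    if kind == "none":
        return "none"
    if kind not in SEGMENTATION_PREFIXES:
        raise ValueError(f"Unknown segmentation kind: {kind!r}")
    if value is None:
        raise ValueError(f"Segmentation kind {kind!r} needs a value")
    # Assumption: the value is written directly after the '=' sign.
    return f"{kind}={value}"


print(segmentation_option())            # none
print(segmentation_option("now", 250))  # now=250

# With LIWC-22 available, the string could be passed on like this:
# liwc.analyze_lsm(df, segmentation=segmentation_option("now", 250))
```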

Narrative arc


Liwc.plot_narrative_arc

 Liwc.plot_narrative_arc (df:pandas.core.frame.DataFrame,
                          legend_labels:list=None)

Plots the narrative arc for the given DataFrame, showing Staging, Plot Progression, and Cognitive Tension.

Type Default Details
df DataFrame Input DataFrame containing the narrative arc data.
Note: set ‘output_individual_data_points=True’ in narrative_arc to get all the data required to plot the narrative arc.
legend_labels list None List of labels for the legend, corresponding to each row in the DataFrame.
Returns Figure The resulting plot figure of the narrative arcs.

Liwc.narrative_arc

 Liwc.narrative_arc (df:pandas.core.frame.DataFrame,
                     column_names:Optional[list]=None,
                     output_individual_data_points:bool=True,
                     scaling_method:str='0-100', segments_number:int=5,
                     skip_wc:int=10)

Analyzes the narrative arc of text data based on the provided DataFrame.

Type Default Details
df DataFrame Input DataFrame containing the text data to be analyzed.
column_names Optional None List of column names in df that should be processed. If None, all columns are processed. Default is None.
output_individual_data_points bool True If True, outputs individual data points for each segment. If False, aggregates the data. Default is True.
scaling_method str 0-100 Method for scaling the data. Options are:
- “0-100”: Scale values between 0 and 100.
- “Z-score”: Scale values using Z-score normalization.
Default is “0-100”.
segments_number int 5 Number of segments into which the text is divided for analysis. Default is 5.
skip_wc int 10 Skip any texts with a word count less than this value. Default is 10.
Returns DataFrame The resulting DataFrame with the narrative arc analysis.
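To give an intuition for segments_number and the two scaling_method options, the sketch below splits a text into equal word-count segments and rescales a list of per-segment scores. It is not LIWC's implementation: the function names, the segment boundaries, the use of the population standard deviation, and the score values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def split_into_segments(text, segments_number=5):
    """Split a text into roughly equal segments by word count."""
    words = text.split()
    n = len(words)
    bounds = [round(i * n / segments_number) for i in range(segments_number + 1)]
    return [" ".join(words[bounds[i]:bounds[i + 1]]) for i in range(segments_number)]


def scale_0_100(values):
    """Min-max scaling: the lowest value maps to 0, the highest to 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def scale_z(values):
    """Z-score scaling, using the population standard deviation (an assumption)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


print(split_into_segments("one two three four five six seven eight nine ten"))
scores = [2, 4, 6]  # made-up per-segment scores
print(scale_0_100(scores))  # [0.0, 50.0, 100.0]
print(scale_z(scores))
```

The "0-100" form is convenient when plotting several texts on one axis, while Z-scores keep information about how far each segment sits from the text's own mean.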