API reference

The LIWC module provides a Python interface to the Linguistic Inquiry and Word Count (LIWC) tool, allowing users to perform text analysis.

Initialization

Liwc

 Liwc (liwc_cli_path:str='LIWC-22-cli', threads:Optional[int]=None,
       verbose:bool=False)

Initialize the Liwc class.

	Type	Default	Details
liwc_cli_path	str	LIWC-22-cli	LIWC CLI Path.
threads	Optional	None	Number of threads to use. Defaults to the number of CPU cores minus one.
verbose	bool	False	Print status messages and show a progress bar. Defaults to False.
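
The `threads` default described above (CPU cores minus one) can be illustrated with the standard library. This is a minimal sketch of one way such a default might be resolved; `resolve_threads` is a hypothetical helper and is not part of the Liwc class, and the floor of one thread is an assumption.

```python
import os
from typing import Optional


def resolve_threads(threads: Optional[int] = None) -> int:
    """Return an explicit thread count, or CPU cores minus one when None.

    Hypothetical helper for illustration; the Liwc class may resolve
    its default differently.
    """
    if threads is not None:
        return threads
    cores = os.cpu_count() or 1   # os.cpu_count() may return None
    return max(1, cores - 1)      # assumption: never drop below one thread


print(resolve_threads())      # depends on the machine
print(resolve_threads(4))     # an explicit value is passed through
```

Passing an explicit integer to `threads` skips the fallback entirely, which is useful on shared machines where leaving only one core free is too aggressive.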

LIWC Analysis

Liwc.analyze_string_to_json

 Liwc.analyze_string_to_json (input_string:str, liwc_dict:str='LIWC22')

Analyze a single string and return the result as JSON.

	Type	Default	Details
input_string	str		The string to analyze.
liwc_dict	str	LIWC22	Dictionary to use for analysis. Defaults to “LIWC22”.
Returns	dict		Analysis results in JSON format.

Liwc.analyze_string

 Liwc.analyze_string (input_string:str, output_location:str,
                      liwc_dict:str='LIWC22')

Analyze a single string using LIWC and save the result to a CSV file.

	Type	Default	Details
input_string	str		The string to analyze.
output_location	str		Path to save the analysis output (.csv).
liwc_dict	str	LIWC22	Dictionary to use for analysis. Defaults to “LIWC22”.
Returns	None		

Liwc.analyze_df

 Liwc.analyze_df (text:pandas.core.series.Series, return_input:bool=False,
                  liwc_dict:str='LIWC22')

Analyze text data from a pandas Series using LIWC.

	Type	Default	Details
text	Series		Pandas Series containing text data.
return_input	bool	False	Whether to return the input text with the output. Defaults to False.
liwc_dict	str	LIWC22	Dictionary to use for analysis. Defaults to “LIWC22”.
Returns	DataFrame		DataFrame containing the analysis results.

Liwc.analyze_folder

 Liwc.analyze_folder (input_folder:str, output_location:str,
                      liwc_dict:str='LIWC22')

Analyze all text files in a folder using LIWC.

	Type	Default	Details
input_folder	str		Path to the folder containing text files.
output_location	str		Path to save the analysis output.
liwc_dict	str	LIWC22	Dictionary to use for analysis. Defaults to “LIWC22”.
Returns	None		

Liwc.analyze_csv

 Liwc.analyze_csv (input_file:str, output_location:str,
                   row_id_indices:str, column_indices:str,
                   liwc_dict:str='LIWC22')

Analyze text data from a CSV file using LIWC.

	Type	Default	Details
input_file	str		Path to the input CSV file.
output_location	str		Path to save the analysis output.
row_id_indices	str		Indices of row IDs in the CSV.
column_indices	str		Indices of text columns in the CSV.
liwc_dict	str	LIWC22	Dictionary to use for analysis. Defaults to “LIWC22”.
Returns	None		
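
Because `row_id_indices` and `column_indices` are typed as `str`, integer positions have to be serialized before the call. The sketch below assumes a comma-separated format such as "2,3"; the reference above only gives the type, so both the separator and whether numbering starts at 0 or 1 are assumptions to check against the LIWC-22 CLI documentation. `to_index_string` is a hypothetical helper.

```python
from typing import Iterable


def to_index_string(indices: Iterable[int]) -> str:
    """Join integer indices into a comma-separated string.

    Assumption: the CLI accepts a comma-separated list; this format
    is not confirmed by the reference above.
    """
    values = list(indices)
    if any(i < 0 for i in values):
        raise ValueError("indices must be non-negative")
    return ",".join(str(i) for i in values)


row_id_indices = to_index_string([1])      # column holding the row ID
column_indices = to_index_string([2, 3])   # columns holding the text
print(row_id_indices, column_indices)
```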

# liwc = Liwc('LIWC-22-cli.exe', verbose=True)
# s = "As Leclerc entered the Invalides, with his cortege of exaltation in the sun of Africa and the battles of Alsace, enter here, Jean Moulin, with your terrible cortege."
# r = liwc.analyze_string_to_json(s)

# desired_keys = ['WC', 'Analytic', 'Clout', 'Authentic', 'Tone']
# filtered_dict = {key: r[key] for key in desired_keys if key in r}
# print(filtered_dict)
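
The filtering step in the commented example above can be tried without LIWC installed. The sketch below replaces the call to `analyze_string_to_json` with a placeholder dictionary; the keys follow the example, while the values and the extra `ExtraKey` entry are invented purely for illustration and are not real LIWC output.

```python
# Placeholder standing in for the dict returned by analyze_string_to_json.
# Values are invented for illustration only.
r = {
    "WC": 0,
    "Analytic": 0.0,
    "Clout": 0.0,
    "Authentic": 0.0,
    "Tone": 0.0,
    "ExtraKey": 0.0,  # hypothetical key that the filter should drop
}

desired_keys = ["WC", "Analytic", "Clout", "Authentic", "Tone"]

# Keep only the requested keys; keys missing from r are skipped silently.
filtered_dict = {key: r[key] for key in desired_keys if key in r}
print(filtered_dict)
```

The `if key in r` guard means a misspelled or unavailable category never raises a `KeyError`; it is simply absent from the result.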

Language Style Matching

Liwc.analyze_lsm

 Liwc.analyze_lsm (df:pandas.core.frame.DataFrame,
                   calculate_lsm:str='person-and-group',
                   group_column:str='GroupID',
                   person_column:str='PersonID', text_column:str='Text',
                   output_type:str='pairwise', expanded_output:bool=False,
                   omit_speakers_number_of_turns:int=0,
                   omit_speakers_word_count:int=10,
                   segmentation:str='none', wsl_mode:bool=True)

Analyzes Language Style Matching (LSM) based on the provided DataFrame.

	Type	Default	Details
df	DataFrame		Input DataFrame containing the text data to be analyzed.
calculate_lsm	str	person-and-group	Sets the type of LSM calculation. Options are: - “person”: Calculate only person-level LSM. - “group”: Calculate only group-level LSM. - “person-and-group”: Calculate both person and group-level LSM. Default is “person-and-group”.
group_column	str	GroupID	The column name in `df` representing the Group ID. Default is ‘GroupID’.
person_column	str	PersonID	The column name in `df` representing the Person ID. Default is ‘PersonID’.
text_column	str	Text	The column name in `df` representing the text data. Default is ‘Text’.
output_type	str	pairwise	Sets the type of output. Options are: - “one-to-many”: One-to-many comparison. - “pairwise”: Pairwise comparison. Default is “pairwise”.
expanded_output	bool	False	Adds an option to get an expanded LSM output. Default is False.
omit_speakers_number_of_turns	int	0	Omit speakers if the number of turns is less than this value. Default is 0.
omit_speakers_word_count	int	10	Omit speakers if the word count is less than this value. Default is 10.
segmentation	str	none	Segmentation options for splitting the text. Options are: - “none”: No segmentation. - “not=”: Number of turns per segment. - “nofst=”: Number of segments by speaker turn. - “nofwc=”: Number of segments by word count. - “now=”: Number of words per segment. - “boc=”: Segmentation based on characters. - “regexp=”: Segmentation based on a regular expression. Default is “none”.
wsl_mode	bool	True	Whether to convert paths for WSL. Defaults to True.
Returns	Union		The resulting LSM analysis. The output format depends on the specified `output_type`.
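
To give a feel for what a pairwise LSM score expresses, the sketch below applies the formula commonly cited in the LSM literature: for each function-word category, 1 - |p1 - p2| / (p1 + p2 + 0.0001), averaged across categories. Whether `analyze_lsm` computes exactly this is an assumption; the function names, category names, and percentages are hypothetical.

```python
def category_lsm(p1: float, p2: float) -> float:
    """LSM for one category, given each speaker's usage percentage."""
    # The small constant avoids division by zero when both values are 0.
    return 1.0 - abs(p1 - p2) / (p1 + p2 + 0.0001)


def lsm_score(a: dict, b: dict) -> float:
    """Average the per-category LSM over the categories both speakers share."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no shared categories to compare")
    return sum(category_lsm(a[c], b[c]) for c in shared) / len(shared)


# Invented usage percentages for two speakers (illustration only).
speaker_a = {"pronoun": 12.0, "article": 6.0, "prep": 14.0}
speaker_b = {"pronoun": 10.0, "article": 7.0, "prep": 13.0}
print(round(lsm_score(speaker_a, speaker_b), 3))
```

Scores near 1 indicate that two speakers use function words at similar rates, and scores near 0 indicate very different rates.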

Narrative arc

Liwc.plot_narrative_arc

 Liwc.plot_narrative_arc (df:pandas.core.frame.DataFrame,
                          legend_labels:list=None)

Plots the narrative arc for the given DataFrame, showing Staging, Plot Progression, and Cognitive Tension.

	Type	Default	Details
df	DataFrame		Input DataFrame containing the narrative arc data. Note: set ‘output_individual_data_points=True’ in narrative_arc to get all the data required to plot the narrative arc.
legend_labels	list	None	List of labels for the legend, corresponding to each row in the DataFrame.
Returns	Figure		The resulting plot figure of the narrative arcs.

Liwc.narrative_arc

 Liwc.narrative_arc (df:pandas.core.frame.DataFrame,
                     column_names:Optional[list]=None,
                     output_individual_data_points:bool=True,
                     scaling_method:str='0-100', segments_number:int=5,
                     skip_wc:int=10)

Analyzes the narrative arc of text data based on the provided DataFrame.

	Type	Default	Details
df	DataFrame		Input DataFrame containing the text data to be analyzed.
column_names	Optional	None	List of column names in `df` that should be processed. If None, all columns are processed. Default is None.
output_individual_data_points	bool	True	If True, outputs individual data points for each segment. If False, aggregates the data. Default is True.
scaling_method	str	0-100	Method for scaling the data. Options are: - “0-100”: Scale values between 0 and 100. - “Z-score”: Scale values using Z-score normalization. Default is “0-100”.
segments_number	int	5	Number of segments into which the text is divided for analysis. Default is 5.
skip_wc	int	10	Skip any texts with a word count less than this value. Default is 10.
Returns	DataFrame		The resulting DataFrame with the narrative arc analysis.
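
The two `scaling_method` options can be illustrated on a list of per-segment values. This is a minimal sketch using standard min-max and z-score definitions; the exact computation inside LIWC (for example, population versus sample standard deviation) is not stated above, so the use of the population standard deviation here is an assumption, and the segment values are invented.

```python
from statistics import mean, pstdev
from typing import List


def scale_0_100(values: List[float]) -> List[float]:
    """Min-max scale so the smallest value maps to 0 and the largest to 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant input: no spread to scale
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def z_score(values: List[float]) -> List[float]:
    """Standardize to mean 0 using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Invented scores for segments_number=5 segments (illustration only).
segments = [2.0, 4.0, 6.0, 4.0, 2.0]
print(scale_0_100(segments))
print([round(z, 3) for z in z_score(segments)])
```

The "0-100" option makes arcs from different texts directly comparable on a fixed range, while "Z-score" keeps information about how far each segment deviates from the text's own mean.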