Dream Analysis in Python - help needed
Hello! I have been trying to write some code in Python that would analyze my dream journal, recognize dreamsigns, and do two things (so far):
(1) Make a plot of dreamsign frequency per week, and
(2) Build a model that predicts which dreamsigns might appear in the next dream.
This way I have a visual output of where I have been, and a prediction as to where I'm headed.
I have been working with ChatGPT and have some code for the predictive model. What I am having trouble with now is simply plotting dreamsign frequency on an XY graph, which is probably a very simple fix for anyone out there who knows coding. Here is my code at the moment for this part:
Code:
import csv
import docx
import pandas as pd
from collections import Counter
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import matplotlib.pyplot as plt
# Function to remove stopwords and punctuation from the text
def preprocess_text(text):
    stop_words = set(stopwords.words('english'))
    tokens = word_tokenize(text.lower())
    filtered_tokens = [token for token in tokens if token.isalpha() and token not in stop_words]
    return filtered_tokens
filename = 'Dream_Journal.docx' # Input: Word document filename
doc = docx.Document(filename)
# Combine all text in the document
text = ' '.join([paragraph.text for paragraph in doc.paragraphs])
# Preprocess the text by removing stopwords and punctuation
filtered_words = preprocess_text(text)
# Count the occurrences of each word
word_counter = Counter(filtered_words)
# Get the most common words as keywords
num_keywords = 200 # Specify the number of keywords you want
keywords = [word for word, _ in word_counter.most_common(num_keywords)]
data = []
categories = []
while True:
    date = input("Enter a date (YYYY-MM-DD) or press Enter to exit: ")
    if not date:
        break
    journal_entry = input("Enter a journal entry: ")
    row_keywords = []
    for keyword in keywords:
        if keyword in journal_entry:
            row_keywords.append(keyword)
            if keyword not in categories:
                categories.append(keyword)
    data.append([date, row_keywords])
csv_filename = 'data.csv'
with open(csv_filename, 'a', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(data)
print(f'CSV file "{csv_filename}" created successfully.')
# Read the CSV file into a pandas DataFrame
df = pd.read_csv(csv_filename)
# Convert 'date' column to datetime type
df['date'] = pd.to_datetime(df['date'])
# Group by week and calculate the frequency of each keyword
df['week'] = df['date'].dt.to_period('W')
df_grouped = df.groupby(['week']).apply(lambda x: x['categories'].sum()).apply(Counter)
# Create the XY plot
plt.figure(figsize=(12, 6))
for keyword in categories: # Plot based on the list of categories
    frequencies = [counter.get(keyword, 0) for counter in df_grouped]
    plt.plot(df_grouped.index.astype(str), frequencies, label=keyword)
plt.xlabel('Week')
plt.ylabel('Frequency')
plt.title('Keyword Frequency Over Weeks')
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
When I run this, it plots something at 0 frequency and nothing else.
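One possible explanation, offered as a guess and not checked against the real data.csv: csv.writer saves each keyword list as its text form (for example "['water', 'school']"), so after reading the file back the 'categories' column holds strings, not lists. Adding those strings together and passing the result to Counter counts single characters, so counter.get(keyword, 0) returns 0 for every keyword. The sketch below reproduces this with made-up rows, using only the standard library to isolate the problem, and then shows one alternative: joining the keywords with ';' when writing and splitting them when reading. The names rows, stored, char_counts, and weekly_counts are hypothetical, and the semicolon separator is an assumption.

```python
import csv
import io
from collections import Counter
from datetime import date

# Made-up journal rows in the same shape the script builds: [date, keywords]
rows = [
    ["2023-05-01", ["water", "school"]],
    ["2023-05-03", ["water"]],
    ["2023-05-10", ["flying", "water"]],
]

# What the script above does: csv.writer stores each list as its text form
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
buffer.seek(0)
stored = [row[1] for row in csv.reader(buffer)]
# Each item in stored is a string such as "['water', 'school']", not a list,
# so Counter sees single characters and never a whole keyword
char_counts = Counter("".join(stored))

# One alternative (an assumption, not the only option): join the keywords
# with ';' when writing and split them again when reading
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["date", "categories"])  # header row so columns can be found by name
writer.writerows([day, ";".join(words)] for day, words in rows)
buffer.seek(0)

weekly_counts = {}  # maps (ISO year, ISO week) to a Counter of keywords
for record in csv.DictReader(buffer):
    year, week, _ = date.fromisoformat(record["date"]).isocalendar()
    words = [w for w in record["categories"].split(";") if w]
    weekly_counts.setdefault((year, week), Counter()).update(words)

print(char_counts.get("water", 0))   # the character-level counter has no whole keywords
for key in sorted(weekly_counts):
    print(key, dict(weekly_counts[key]))
```

In the pandas version the same idea would mean writing a header row with 'date' and 'categories', then splitting the 'categories' column (for instance with .str.split(';')) before building the per-week counters.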