Python :Altair Python Package

Description

This was a project that takes web scraped data that showcases the scripts from multiple youtube video essays. The goal is to produced a striking visualization of the topics discussed the most in these videos. The videos cover a variety of topics including political but also on series like Lord of the Rings. As Youtube videos become a growing portion of content younger audiences consume these types of visualizations serve as as sign of some of the topics the youth and content creators are discussing.

import pandas as pd
import numpy as np
import re
! pip install nltk
import nltk
nltk.download('stopwords')
# Gensim, for topic modeling
import gensim
import gensim.corpora as corpora
from gensim.utils import simple_preprocess
from gensim.models import CoherenceModel

# Plotting tools
import matplotlib.pyplot as plt
%matplotlib inline

    Requirement already satisfied: nltk in c:\users\colto\anaconda3\lib\site-packages (3.8.1)
    Requirement already satisfied: click in c:\users\colto\anaconda3\lib\site-packages (from nltk) (8.0.4)
    Requirement already satisfied: joblib in c:\users\colto\anaconda3\lib\site-packages (from nltk) (1.2.0)
    Requirement already satisfied: regex>=2021.8.3 in c:\users\colto\anaconda3\lib\site-packages (from nltk) (2022.7.9)
    Requirement already satisfied: tqdm in c:\users\colto\anaconda3\lib\site-packages (from nltk) (4.65.0)
    Requirement already satisfied: colorama in c:\users\colto\anaconda3\lib\site-packages (from click->nltk) (0.4.6)
    

    [nltk_data] Downloading package stopwords to
    [nltk_data]     C:\Users\colto\AppData\Roaming\nltk_data...
    [nltk_data]   Unzipping corpora\stopwords.zip.
    


df = pd.read_csv("Data/YoutubeVideoEssayProject.csv")
df = df.drop(columns ="Unnamed: 9")
df
Title Views Likes Date Length Transcript Description Creator Creator Subscribers
0 David Lynch: The Treachery of Language 275,647 14,000 3/30/18 11:10 0:00\n[Interviewer] “David Lynch has described... David Lynch is famous for his reluctance to ve... What's So Great About That? 79,900
1 CTRL+ALT+DEL | SLA:3 2,400,000 83,000 4/27/18 34:10 0:00\nHi, I'm Hareton Splimby, and welcome to ... In attempting to go fast, Hareton Splimby suff... hbomberguy 1,290,000
2 The Hobbit: A Long-Expected Autopsy (Part 1/2) 5,000,000 126,000 3/28/18 36:48 0:07\nIn mid-2015, less than a year before her... In which we look back at The Hobbit trilogy an... Lindsay Ellis 5,086,953
3 Making Games Better for Gamers with Colourblin... 370,306 19,000 8/22/18 13:55 0:00\nVideo games are for everyone, and they c... Video games are for everyone. But disabled peo... Game Maker's Toolkit 1,510,000
4 FAKE FRIENDS EPISODE TWO: parasocial hell 224,898 10,000 8/11/18 1:54:34 0:03\n[Shannon] Grape-kun was a real-life peng... it's done!!!!\n\n \n\n / struccimovies \nhttp... StrucciMovies 46,700
5 Incels | ContraPoints 5,823,697 243,000 8/17/18 34:05 0:00\n[Mendelssohn: String Quartet No. 6 in F ... Hello boys. Let's talk about bone structure.\n... ContraPoints 5,823,697
6 DOOM: The Fake Outrage 845,440 38,000 9/1/18 24:32 0:00\nhello everyone today we're going to be\n... Countdown to the first accusation of meta-meta... Shaun 662,000
7 Disney - The Magic of Animation 610,400 45,000 10/3/18 15:47 0:00\nforeign\n0:07\nwhat is it about Disney A... A look at the 12 principles of animation devel... kaptainkristian 45,000
8 Nostalghia Critique 106,500 5,400 11/27/18 9:11 0:07\nThere are a few things you can't do on Y... A reflection on cinema, self, and other nonsen... KyleKallgreen 80,800
9 In Search Of A Flat Earth 3,423,150 129,000 9/11/20 1:16:16 Prologue\n0:00\n[Laid back folk music] A few m... Clickbait Title: The Twist at 37 Minutes Will ... Folding Ideas 892,000
10 The Satirical Resurgence of Reefer Madness 81,974 6,900 11/10/20 26:58 Transcript https://www.snap4freedom.org/home\n \n\n / yha... Yhara zayd 242,000
11 The Strange Reality of Roller Coaster Tycoon 1,383,521 55,000 7/19/20 18:11 0:09\nThere is at least one roller coaster des... Both birds are yellow but the louder one is ye... Jacob Geller 1,150,000
12 CATS & The Weird Mind of TS Eliot | An Analysis 334,188 16,000 3/24/20 58:50 0:12\nthe speaking on eliot is a difficult mat... If you want to directly support me and see thi... Maggie Mae Fish 221,000
13 The Anatomy of Stan Culture 108,681 8,200 3/8/20 18:28 0:00\nthe audience is important to any\n0:01\n... How much do you love celebrities? As a fan, wh... Intelexual Media 281,000
14 On Writing: Mental Illness in Video Games | a ... 246,532 21,000 4/3/20 33:33 0:00\nthis video is going to deal with sensiti... It's only because of independent support throu... Hello Future Me 1,008,000
15 Why Anime is for Black People - Hip Hop x Anim... 180,348 15,000 9/25/20 18:34 0:00\nanybody who's been alive in the past 20\... Over the years, it's hard not to point out ani... Beyond The Bot 84,100
16 What Is *Good* Queer Representation in 2020? 181,933 12,000 8/14/20 48:21 0:00\nthis was actually a really difficult vid... Clips and Spoilers from the following shows:\n... Princess Weekes 222,000
17 Fallout: New Vegas Is Genius, And Here's Why 9,754,066 251,000 12/19/20 1:37:41 0:00\n- The first two "Fallouts" are still two... (Spoilers for New Vegas, obviously)\n\nMy Twit... hbomberguy 1,290,000
18 Whisper of the Heart: How Does It Feel to Be a... 126,458 9,500 5/28/20 13:30 0:03\n"Alright..."\n0:05\n"And action!"\n0:06\... Accented Cinema - Episode 37\n\nThis is a bit ... Accented Cinema 456,000
19 Your Island is a Commune pt. 1 | Nowhere Grotesk 87,724 7,300 1/11/20 22:31 0:00\nI've never engaged with the game the way... Capitalism or community, choose one. Join us n... Nowhere Grotesk 8,510
20 The Market of Humiliating Black Women 939,173 102,000 2/22/21 26:11 0:00\nHi everyone. Welcome to this video. My ... Hey cuties welcome back :) In today's video I'... Tee Noir 631,000
21 The Day Rue "Became" Black 1,965,436 126,000 5/19/21 35:33 0:00\nGreetings and salutations, before we get... Go to https://nordvpn.com/yhara or use code yh... Yhara zayd 242,000
22 Infantilization and the Body Hair Debate 1,930,899 140,000 8/14/21 35:40 1:09\nSo, I got Brazilian wax recently. [cric... it took 13 hours to edit this.\nSOCIALS:\nko-f... Shanspeare 590,000
23 Bo Burnham vs. Jeff Bezos 1,345,904 80,000 8/20/21 2:26:41 0:01\n[ Screams ] Every book of the Bible was ... COVID make man sad\n\nPatreon: \n\n / cjthex ... CJ The X 297,000
24 The reign of the Slim-Thick Influencer | Khadi... 1,886,067 114,000 8/22/21 54:18 0:00\nThis video is brought to you by Squaresp... Head to ​https://www.squarespace.com/khadijamb... Khadija Mbowe 611,000
25 make more characters bi, you cowards: why (not... 8,217 NaN 8/28/21 51:16 0:00\nhello void and all who inhabit it, it's ... sorry again for the glitchy video. the perks o... voice memos for the void 10,000
26 The Black Right Wing || Anansi’s Library 26,220 2,400 9/24/21 8:17 0:00\nour justice system is gone blm has taken... NEW TWITTER: \n\n / localpunkanansi NEW PAT... Anansi's Library 25,200
27 On Leftist Disunity 72,060 7,200 10/27/21 11:34 0:00\nNot long after I’d become radicalized, b... Watch the whole video before commenting. As I’... Andrewism 161,000
28 Break Bread 751,463 50,000 12/6/21 1:40:30 0:04\nso in the following years as the anti-sj... After a few months of “success” on the platfor... F.D Signifier 611,000
29 Meet Dave | Captain Ahab: The Story of Dave St... 828,195 18,000 3/1/22 48:33 0:00\n(pulsing music)\n0:05\n- [Jon] It's the ... “Who’s Dave Stieb?” you might be asking. Well,... Secret Base 1,360,000
30 Why Panzer Dragoon Saga is the Greatest RPG No... 112,523 6,700 6/24/22 53:30 0:02\nearth may not be forever but we still ha... This is the story of the most important and in... Micheal Saba 86,300
31 Nice White Teachers, Bad Brown Schools: Hollyw... 331,890 22,000 6/25/22 40:53 0:18\nthis video is brought to you by mubi a c... Hirokazu Koreeda: A Double Bill is now streami... Yhara zayd 242,000
32 Instagram Hates Its Users 1,001,437 57,000 8/31/22 31:33 0:00\neveryone is mad at instagram right now a... Use code JARVIS130 to get $130 off across 6 Fa... Jarvis Johnson 2,040,000
33 Fixing My Brain with Automated Therapy 979,429 57,000 9/2/22 52:50 0:00\nnow when we hear the word therapy, certa... People laugh about this, self-soothing engines... Jacob Geller 1,150,000
34 Parking lots are everywhere and nowhere 13,904 1,200 9/22/22 11:34 0:00\nWe all have those formative childhood ex... Parking lots are the ambient architecture of a... What's So Great About That? 79,900
35 How Degrowth Can Save The World 136,004 9,700 11/2/22 36:54 0:00\nour world is dying. Or more accurately, ... Capitalism is based on the cancerous logic of ... Andrewism 161,000
#cleaned text in df
def clean_text(list_): 
    regex_newline = re.compile(r'\n')
    regex_timestamp = re.compile(r'\d:\d\d')
    regex_whitespace = re.compile(r'\s{2,}')
    for index, item in enumerate(list_):
        item = str(item)
        item = re.sub(regex_newline, " ", item)
        item = re.sub(regex_timestamp, " ", item)
        item = re.sub(regex_whitespace, " ", item)
        item = re.sub(r'"', "", item)
        item = re.sub(r'\s1\s', ' ', item)
        item = re.sub(r'\s2\s', ' ', item)
        list_[index] = item
    return list_
df["Transcript"] = clean_text(df["Transcript"])
df["Description"] = clean_text(df["Description"])
df
Title Views Likes Date Length Transcript Description Creator Creator Subscribers
0 David Lynch: The Treachery of Language 275,647 14,000 3/30/18 11:10 [Interviewer] “David Lynch has described his ... David Lynch is famous for his reluctance to ve... What's So Great About That? 79,900
1 CTRL+ALT+DEL | SLA:3 2,400,000 83,000 4/27/18 34:10 Hi, I'm Hareton Splimby, and welcome to Serio... In attempting to go fast, Hareton Splimby suff... hbomberguy 1,290,000
2 The Hobbit: A Long-Expected Autopsy (Part 1/2) 5,000,000 126,000 3/28/18 36:48 In mid-2015, less than a year before her deat... In which we look back at The Hobbit trilogy an... Lindsay Ellis 5,086,953
3 Making Games Better for Gamers with Colourblin... 370,306 19,000 8/22/18 13:55 Video games are for everyone, and they can ev... Video games are for everyone. But disabled peo... Game Maker's Toolkit 1,510,000
4 FAKE FRIENDS EPISODE TWO: parasocial hell 224,898 10,000 8/11/18 1:54:34 [Shannon] Grape-kun was a real-life penguin t... it's done!!!! / struccimovies https://ko-fi.co... StrucciMovies 46,700
5 Incels | ContraPoints 5,823,697 243,000 8/17/18 34:05 [Mendelssohn: String Quartet No. 6 in F minor... Hello boys. Let's talk about bone structure. S... ContraPoints 5,823,697
6 DOOM: The Fake Outrage 845,440 38,000 9/1/18 24:32 hello everyone today we're going to be talkin... Countdown to the first accusation of meta-meta... Shaun 662,000
7 Disney - The Magic of Animation 610,400 45,000 10/3/18 15:47 foreign what is it about Disney Animation tha... A look at the 12 principles of animation devel... kaptainkristian 45,000
8 Nostalghia Critique 106,500 5,400 11/27/18 9:11 There are a few things you can't do on YouTub... A reflection on cinema, self, and other nonsen... KyleKallgreen 80,800
9 In Search Of A Flat Earth 3,423,150 129,000 9/11/20 1:16:16 Prologue [Laid back folk music] A few minutes ... Clickbait Title: The Twist at 37 Minutes Will ... Folding Ideas 892,000
10 The Satirical Resurgence of Reefer Madness 81,974 6,900 11/10/20 26:58 Transcript https://www.snap4freedom.org/home / yharazayd ... Yhara zayd 242,000
11 The Strange Reality of Roller Coaster Tycoon 1,383,521 55,000 7/19/20 18:11 There is at least one roller coaster designed... Both birds are yellow but the louder one is ye... Jacob Geller 1,150,000
12 CATS & The Weird Mind of TS Eliot | An Analysis 334,188 16,000 3/24/20 58:50 the speaking on eliot is a difficult matter, ... If you want to directly support me and see thi... Maggie Mae Fish 221,000
13 The Anatomy of Stan Culture 108,681 8,200 3/8/20 18:28 the audience is important to any celebrity's ... How much do you love celebrities? As a fan, wh... Intelexual Media 281,000
14 On Writing: Mental Illness in Video Games | a ... 246,532 21,000 4/3/20 33:33 this video is going to deal with sensitive an... It's only because of independent support throu... Hello Future Me 1,008,000
15 Why Anime is for Black People - Hip Hop x Anim... 180,348 15,000 9/25/20 18:34 anybody who's been alive in the past 20 years... Over the years, it's hard not to point out ani... Beyond The Bot 84,100
16 What Is *Good* Queer Representation in 2020? 181,933 12,000 8/14/20 48:21 this was actually a really difficult video to... Clips and Spoilers from the following shows: S... Princess Weekes 222,000
17 Fallout: New Vegas Is Genius, And Here's Why 9,754,066 251,000 12/19/20 1:37:41 - The first two Fallouts are still two of the... (Spoilers for New Vegas, obviously) My Twitter... hbomberguy 1,290,000
18 Whisper of the Heart: How Does It Feel to Be a... 126,458 9,500 5/28/20 13:30 Alright... And action! Being an artist myself... Accented Cinema - Episode 37 This is a bit of ... Accented Cinema 456,000
19 Your Island is a Commune pt. 1 | Nowhere Grotesk 87,724 7,300 1/11/20 22:31 I've never engaged with the game the way I do... Capitalism or community, choose one. Join us n... Nowhere Grotesk 8,510
20 The Market of Humiliating Black Women 939,173 102,000 2/22/21 26:11 Hi everyone. Welcome to this video. My name i... Hey cuties welcome back :) In today's video I'... Tee Noir 631,000
21 The Day Rue "Became" Black 1,965,436 126,000 5/19/21 35:33 Greetings and salutations, before we get into... Go to https://nordvpn.com/yhara or use code yh... Yhara zayd 242,000
22 Infantilization and the Body Hair Debate 1,930,899 140,000 8/14/21 35:40 So, I got Brazilian wax recently. [crickets]... it took 13 hours to edit this. SOCIALS: ko-fi:... Shanspeare 590,000
23 Bo Burnham vs. Jeff Bezos 1,345,904 80,000 8/20/21 2:26:41 [ Screams ] Every book of the Bible was writt... COVID make man sad Patreon: / cjthex Twitter: ... CJ The X 297,000
24 The reign of the Slim-Thick Influencer | Khadi... 1,886,067 114,000 8/22/21 54:18 This video is brought to you by Squarespace. ... Head to ​https://www.squarespace.com/khadijamb... Khadija Mbowe 611,000
25 make more characters bi, you cowards: why (not... 8,217 NaN 8/28/21 51:16 hello void and all who inhabit it, it's me an... sorry again for the glitchy video. the perks o... voice memos for the void 10,000
26 The Black Right Wing || Anansi’s Library 26,220 2,400 9/24/21 8:17 our justice system is gone blm has taken over... NEW TWITTER: / localpunkanansi NEW PATREON LIN... Anansi's Library 25,200
27 On Leftist Disunity 72,060 7,200 10/27/21 11:34 Not long after I’d become radicalized, began ... Watch the whole video before commenting. As I’... Andrewism 161,000
28 Break Bread 751,463 50,000 12/6/21 1:40:30 so in the following years as the anti-sjw cha... After a few months of “success” on the platfor... F.D Signifier 611,000
29 Meet Dave | Captain Ahab: The Story of Dave St... 828,195 18,000 3/1/22 48:33 (pulsing music) - [Jon] It's the final game o... “Who’s Dave Stieb?” you might be asking. Well,... Secret Base 1,360,000
30 Why Panzer Dragoon Saga is the Greatest RPG No... 112,523 6,700 6/24/22 53:30 earth may not be forever but we still have th... This is the story of the most important and in... Micheal Saba 86,300
31 Nice White Teachers, Bad Brown Schools: Hollyw... 331,890 22,000 6/25/22 40:53 this video is brought to you by mubi a curate... Hirokazu Koreeda: A Double Bill is now streami... Yhara zayd 242,000
32 Instagram Hates Its Users 1,001,437 57,000 8/31/22 31:33 everyone is mad at instagram right now and fo... Use code JARVIS130 to get $130 off across 6 Fa... Jarvis Johnson 2,040,000
33 Fixing My Brain with Automated Therapy 979,429 57,000 9/2/22 52:50 now when we hear the word therapy, certain th... People laugh about this, self-soothing engines... Jacob Geller 1,150,000
34 Parking lots are everywhere and nowhere 13,904 1,200 9/22/22 11:34 We all have those formative childhood experie... Parking lots are the ambient architecture of a... What's So Great About That? 79,900
35 How Degrowth Can Save The World 136,004 9,700 11/2/22 36:54 our world is dying. Or more accurately, it is... Capitalism is based on the cancerous logic of ... Andrewism 161,000
# Initialize TfidfVectorizer, using English stopwords and converting words to lowercase
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf_vectorizer = TfidfVectorizer(stop_words='english', lowercase=True)
tfidf_matrix = tfidf_vectorizer.fit_transform(df['Transcript']) # Generate a matrix
tfidf_df = pd.DataFrame(tfidf_matrix.toarray(), columns=tfidf_vectorizer.get_feature_names_out()) # Convert matrix to dataframe
tfidf_df.set_index(df['Title'], inplace=True) 
tfidf_df = tfidf_df.stack().reset_index()
tfidf_df = tfidf_df.rename(columns={0:'tfidf', 'Title': 'document','level_1': 'term'})
tfidf_df.sort_values(by=['document','tfidf'], ascending=[True,False]).groupby(['document']).head(1)
document term tfidf
306869 Bo Burnham vs. Jeff Bezos bezos 0.593870
374618 Break Bread content 0.307921
163352 CATS & The Weird Mind of TS Eliot | An Analysis eliot 0.510959
15631 CTRL+ALT+DEL | SLA:3 comic 0.354365
83354 DOOM: The Fake Outrage doom 0.477031
7132 David Lynch: The Treachery of Language lynch 0.663959
93322 Disney - The Magic of Animation action 0.376898
58408 FAKE FRIENDS EPISODE TWO: parasocial hell grape 0.437322
230324 Fallout: New Vegas Is Genius, And Here's Why fallout 0.400686
442363 Fixing My Brain with Automated Therapy eliza 0.331601
470344 How Degrowth Can Save The World growth 0.429553
124277 In Search Of A Flat Earth flat 0.596912
72416 Incels | ContraPoints incels 0.568979
297674 Infantilization and the Body Hair Debate hair 0.393264
431326 Instagram Hates Its Users instagram 0.592285
42176 Making Games Better for Gamers with Colourblin... colour 0.305557
396597 Meet Dave | Captain Ahab: The Story of Dave St... stieb 0.667119
423654 Nice White Teachers, Bad Brown Schools: Hollyw... teachers 0.389858
108495 Nostalghia Critique clips 0.305741
365537 On Leftist Disunity left 0.302464
194845 On Writing: Mental Illness in Video Games | a ... player 0.357307
453521 Parking lots are everywhere and nowhere car 0.339618
174633 The Anatomy of Stan Culture celebrity 0.539281
346796 The Black Right Wing || Anansi’s Library black 0.342267
289197 The Day Rue "Became" Black rue 0.445351
32224 The Hobbit: A Long-Expected Autopsy (Part 1/2) hobbit 0.543211
267080 The Market of Humiliating Black Women black 0.325939
145030 The Satirical Resurgence of Reefer Madness transcript 1.000000
148394 The Strange Reality of Roller Coaster Tycoon coaster 0.635670
320313 The reign of the Slim-Thick Influencer | Khadi... body 0.376040
221973 What Is *Good* Queer Representation in 2020? queer 0.537510
249805 Whisper of the Heart: How Does It Feel to Be a... shizuku 0.315560
199964 Why Anime is for Black People - Hip Hop x Anim... anime 0.524849
402265 Why Panzer Dragoon Saga is the Greatest RPG No... dragoon 0.515901
265197 Your Island is a Commune pt. 1 | Nowhere Grotesk village 0.433637
333442 make more characters bi, you cowards: why (not... bi 0.721916
!pip install altair
Requirement already satisfied: altair in c:\users\colto\anaconda3\lib\site-packages (5.1.2)
Requirement already satisfied: jinja2 in c:\users\colto\anaconda3\lib\site-packages (from altair) (3.1.2)
Requirement already satisfied: jsonschema>=3.0 in c:\users\colto\anaconda3\lib\site-packages (from altair) (4.17.3)
Requirement already satisfied: numpy in c:\users\colto\anaconda3\lib\site-packages (from altair) (1.24.3)
Requirement already satisfied: packaging in c:\users\colto\anaconda3\lib\site-packages (from altair) (23.1)
Requirement already satisfied: pandas>=0.25 in c:\users\colto\anaconda3\lib\site-packages (from altair) (2.0.3)
Requirement already satisfied: toolz in c:\users\colto\anaconda3\lib\site-packages (from altair) (0.12.0)
Requirement already satisfied: attrs>=17.4.0 in c:\users\colto\anaconda3\lib\site-packages (from jsonschema>=3.0->altair) (22.1.0)
Requirement already satisfied: pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0 in c:\users\colto\anaconda3\lib\site-packages (from jsonschema>=3.0->altair) (0.18.0)
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\colto\anaconda3\lib\site-packages (from pandas>=0.25->altair) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\colto\anaconda3\lib\site-packages (from pandas>=0.25->altair) (2023.3.post1)
Requirement already satisfied: tzdata>=2022.1 in c:\users\colto\anaconda3\lib\site-packages (from pandas>=0.25->altair) (2023.3)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\colto\anaconda3\lib\site-packages (from jinja2->altair) (2.1.1)
Requirement already satisfied: six>=1.5 in c:\users\colto\anaconda3\lib\site-packages (from python-dateutil>=2.8.2->pandas>=0.25->altair) (1.16.0)
# Some fancy visualizations to highlight the words with highest TF-IDF score in each inaugural address
import altair as alt

top_tfidf = tfidf_df.sort_values(by=['document','tfidf'], ascending=[True,False]).groupby(['document']).head(10)

# Terms in this list will get a red dot in the visualization
term_list = ['queer', 'peace']

# adding a little randomness to break ties in term ranking
top_tfidf_plusRand = top_tfidf.copy()
top_tfidf_plusRand['tfidf'] = top_tfidf_plusRand['tfidf'] + np.random.rand(top_tfidf.shape[0])*0.0001

# base for all visualizations, with rank calculation
base = alt.Chart(top_tfidf_plusRand).encode(
    x = 'rank:O',
    y = 'document:N'
).transform_window(
    rank = "rank()",
    sort = [alt.SortField("tfidf", order="descending")],
    groupby = ["document"],
)
# heatmap specification
heatmap = base.mark_rect().encode(
    color = 'tfidf:Q'
)

# red circle over terms in above list
circle = base.mark_circle(size=100).encode(
    color = alt.condition(
        alt.FieldOneOfPredicate(field='term', oneOf=term_list),
        alt.value('red'),
        alt.value('#FFFFFF00')        
    )
)

# text labels, white for darker heatmap colors
text = base.mark_text(baseline='middle').encode(
    text = 'term:N',
    color = alt.condition(alt.datum.tfidf >= 0.23, alt.value('white'), alt.value('black'))
)

# display the three superimposed visualizations
(heatmap + circle + text).properties(width = 600)