Today I'm excited to launch StreamEA, a Python app with NLP superpowers!
I've been working on this one for a while, and I'm so pleased to finally share it with the world!
The app combines the power of the Google Natural Language API with Python Pandas to extract entities from web pages, along with their salience scores!
You only need to upload your Google Language credentials, and you're off!
The app is still in Beta, so your feedback (bug spotting and suggestions) is appreciated! My Twitter DMs are open. :)
Below's a quick tour of what it does and how to use it.
Step 1 - Upload your GCP credentials
First, you need to upload your JSON key. If you haven't got one yet, you can follow the instructions here.
Once you've downloaded your key, upload it (or drag and drop it) in the file uploader - as follows:
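For the curious, here's a minimal sketch of the kind of sanity check that can be run on an uploaded key before anything is sent to Google. It is not StreamEA's actual code: the function name `parse_service_account_key` and the list of required fields are my own assumptions, based on what a standard GCP service-account key file contains.

```python
import json

# Fields a Google Cloud service-account key file normally contains
# (assumption: this subset is enough for a quick sanity check).
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}


def parse_service_account_key(raw: bytes) -> dict:
    """Parse an uploaded key file and fail early if it looks wrong."""
    try:
        key = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("The uploaded file is not valid JSON") from exc
    if not isinstance(key, dict):
        raise ValueError("The key file should contain a JSON object")
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError("Expected a key of type 'service_account'")
    return key
```

Failing early like this gives a clearer error message than waiting for the API call itself to reject the credentials.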
Step 2 - Compare 2 URLs
Currently, StreamEA allows you to compare two web pages (bulk upload is coming ;))
You simply need to paste one URL in each field, e.g.:
Some interesting use cases:
Find entities that exist on competitors' pages that outrank you, yet are missing from your pages
Differentiate pages on your website
Research topics: discover synonyms and alternative lexical fields
Find how well you've covered a specific topic
Step 3 - Estimate API call costs
You can check how much API calls will cost before going ahead. Some useful tidbits regarding pricing:
The usage of the Language API is calculated in "units"
1 unit per 1,000 characters
Below's a cost overview - in US dollars:
You can also find more information on how pricing is calculated here.
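To make the unit arithmetic concrete, here's a rough sketch of a cost estimate. The "1 unit per 1,000 characters" rule comes from the notes above; rounding up per document is my reading of Google's pricing page, and `price_per_1000_units` is a hypothetical parameter you'd fill in from that page. The sketch ignores any free monthly allowance.

```python
import math

CHARS_PER_UNIT = 1_000  # 1 unit per 1,000 characters


def count_units(text: str) -> int:
    """Units for one document, rounded up (assumption: partial units are billed as whole)."""
    return math.ceil(len(text) / CHARS_PER_UNIT)


def estimate_cost(texts: list[str], price_per_1000_units: float) -> float:
    """Estimated cost for a batch of documents.

    price_per_1000_units is a placeholder: look up the current rate
    on Google's pricing page before relying on the result.
    """
    units = sum(count_units(text) for text in texts)
    return units / 1000 * price_per_1000_units
```

For example, a 2,500-character page counts as 3 units and an 800-character page as 1 unit, so comparing the two uses 4 units in total.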
Step 4 - Send the request to the Google Language API
If you're happy with the cost, click on "Proceed" to send a request to the API:
Note that the app doesn't yet work for very long articles (like this one). Hopefully, I'll get that sorted soon.
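If you're wondering what such a request looks like under the hood, here's a hedged sketch of the JSON body for the API's `documents:analyzeEntities` REST method, plus a small parser that pulls salience scores out of a response. This is an illustration, not how StreamEA is implemented: the helper names `build_request_body` and `extract_salience` are mine, and keeping the highest score when an entity name appears twice is an assumption.

```python
import json

# REST endpoint for entity analysis (v1 of the Natural Language API).
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeEntities"


def build_request_body(text: str) -> str:
    """JSON body for an analyzeEntities call on plain text."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return json.dumps(body)


def extract_salience(response_json: str) -> dict[str, float]:
    """Map each entity name to its highest salience score in the response."""
    scores: dict[str, float] = {}
    for entity in json.loads(response_json).get("entities", []):
        name = entity["name"]
        # Treat a missing salience field as 0.0 (assumption).
        scores[name] = max(scores.get(name, 0.0), entity.get("salience", 0.0))
    return scores
```

The body would be POSTed to `ENDPOINT` with valid credentials; authentication is left out of this sketch.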
Now here comes the fun part: getting the results!
Step 5a - Spot the Top 15 missing entities in your content
This section is great for finding entities that exist on a competitor's page outranking you, yet are missing from your page.
You'll get two tables:
The left table shows the Top 15 entities in URL 01 not in URL 02.
Similarly, the right table shows the Top 15 entities in URL 02 not in URL 01.
These entities are sorted by salience scores, so only the 15 most relevant are shown.
Don't worry, you can also download *full* lists as CSVs - more on that below.
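The logic behind those two tables boils down to a set difference followed by a sort. Here's a small sketch, assuming each page's results are held as a dict of entity name to salience score. The app itself relies on Pandas, so treat this standard-library version as an illustration only; the function name `top_missing` and the case-insensitive name matching are my own assumptions.

```python
def top_missing(
    source: dict[str, float], other: dict[str, float], n: int = 15
) -> list[tuple[str, float]]:
    """Entities found in `source` but not in `other`, highest salience first."""
    # Compare names case-insensitively (assumption) so "Python" matches "python".
    other_names = {name.lower() for name in other}
    missing = [
        (name, score)
        for name, score in source.items()
        if name.lower() not in other_names
    ]
    return sorted(missing, key=lambda pair: pair[1], reverse=True)[:n]
```

Calling `top_missing(url_01, url_02)` would give the left table and `top_missing(url_02, url_01)` the right one.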
Step 5b - Check the Master table
The master table gathers *ALL* results from the API call:
A column showing salience score differences between URL 01 and URL 02 will be added soon.
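As a rough idea of how a combined table with a difference column could be put together, here's a sketch that merges the two pages' scores into one list of rows. The column names, the function name `build_master_table`, and treating a missing entity as salience 0 when computing the difference are all assumptions on my part, not the app's actual layout.

```python
def build_master_table(
    url_01: dict[str, float], url_02: dict[str, float]
) -> list[dict]:
    """One row per entity seen on either page, with both scores and their difference."""
    rows = []
    for name in sorted(url_01.keys() | url_02.keys()):
        score_01 = url_01.get(name)  # None when the entity is absent from URL 01
        score_02 = url_02.get(name)  # None when the entity is absent from URL 02
        rows.append(
            {
                "entity": name,
                "salience_url_01": score_01,
                "salience_url_02": score_02,
                # Missing scores count as 0.0 in the difference (assumption).
                "salience_diff": (score_01 or 0.0) - (score_02 or 0.0),
            }
        )
    return rows
```

A positive `salience_diff` means the entity carries more weight on URL 01, a negative one means it carries more weight on URL 02.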
Step 6 - Export the output data to CSV
Last but not least, you can export these 3 tables independently to CSV:
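For reference, turning a table into CSV text takes only a few lines with Python's built-in `csv` module. This is a sketch under the assumption that a table is a list of dict rows sharing the same keys; `rows_to_csv` is a hypothetical helper name, not something taken from the app.

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize a list of dict rows to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

In a Streamlit app, the resulting string could then be handed to something like `st.download_button` so each table gets its own download.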
Shout outs & support
Kudos to BritneyMuller's recent MozCon talk for inspiring me to create this app! Kudos also to Sascha and the Streamlit community; these folks are always there to help!
Lastly, this app is free and should remain that way. Buy me a coffee if it's useful to you!
Drop me a line if you have questions, bug reports or suggestions!
There seems to be an "application error" when I go to the app. Will it get fixed any time soon?
The tool looks great. But there's an "application error" when I go to the page.
Thanks very much for building this app. I have been working with schema for some time now, and Google's NLP tool even longer, and I have not seen enough SEOs talk about entities and what they mean for rankings in Google's RankBrain AI. I made a video way back in 2019 about Google's NLP API, and I still love using entities both in my on-page copy, to cover an article topic comprehensively, and inside my schema markup, to nail down "relevance". As the engineers at Google have said, think in terms of "things, not strings", especially when it comes to the Knowledge Graph and MREIDs for topical authority.