Clustering Search Volumes With KMeans

So I’m sure you’re totally interested in what I do at work, BUT this one is cute, pretty and clever… I think!

The notebook is here on my GitHub, but here are some of the highlights that I’m really happy with.



Comparing Corpora using Frequency Profiling – Rayson & Garside

If you want to learn how to do a technique, it’s a good idea to check the source of the technique in the first place. Whilst Rayson and Garside didn’t invent the technique, they perfected it! In the last post I explained how I implemented their work; this post is all about the ins and outs of their paper, which has been cited a huge 492 times!

Rayson, P., & Garside, R. (2000, October). Comparing corpora using frequency profiling. In Proceedings of the workshop on Comparing Corpora (pp. 1-6). Association for Computational Linguistics.
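For a flavour of what frequency profiling involves, here is a minimal sketch of the log-likelihood score the paper uses to compare one word's frequency across two corpora. This is not the code from my last post; the function name and argument names are made up for illustration, and it assumes you already have raw word counts and total corpus sizes.

```python
import math


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """Log-likelihood score for one word across two corpora.

    freq_a, freq_b: how often the word occurs in corpus A and corpus B.
    size_a, size_b: total number of words in corpus A and corpus B.
    (All names here are hypothetical, chosen for this sketch.)
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b

    # Expected counts if the word were spread evenly across both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size

    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        # A zero count contributes nothing (and log(0) is undefined).
        if observed > 0:
            score += observed * math.log(observed / expected)
    return 2 * score


# A word seen 20 times in one corpus and 5 times in another,
# where both corpora contain 1000 words.
print(log_likelihood(20, 5, 1000, 1000))
```

The higher the score, the bigger the difference in relative frequency between the two corpora. A score above 3.84 corresponds to p < 0.05 on a chi-squared distribution with one degree of freedom.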


Masculinities in Cyberspace – Schmitz & Kazyak

You know me, I’m fascinated by masculinities online and when I came across this citation I just couldn’t resist! I’m usually a stickler for methodology in gender research but this paper really got me thinking. I’ll admit it’s not my perfect cup of tea…

But it’s pretty close!

Schmitz, R. M., & Kazyak, E. (2016). Masculinities in Cyberspace: An Analysis of Portrayals of Manhood in Men’s Rights Activist Websites. Social Sciences, 5(2), 18.


Corpus Annotation

On the back of the corpus chapter that I read through here, I thought I would pick up an old project, which I might explain in another post. Long story short, I wanted to try to build a system that takes input text and returns innuendo. I chose innuendo as a form of humour because almost anything seems easy to twist into one, so training material for the system should be plentiful.


Corpus Linguistics – Tony McEnery

I went into this chapter (24 in the Oxford Handbook of Computational Linguistics) to answer a question that motivated me to get the book in the first place: “How should I extract a quantitative proof from a corpus?”. Unfortunately, it didn’t answer this question, but it did provide a great jumping-off point for further research.

Mitkov, R. (2005). The Oxford handbook of computational linguistics. Oxford University Press.
