A Social Science Discovery Engine

Wed, Jul 22, 2020 3-minute read

I like to read social science papers, so I like to be able to find them easily. One of my favorite aggregators of papers - not just in economics, but in other social sciences as well - is the Social Science Research Network (SSRN). Most of the time, SSRN and other aggregators and indexers like Google Scholar or arXiv help me find well-known papers matching whatever specification I want. But I’ve always had a nagging question: what papers am I not seeing? What are the papers that are pushed down by search engines?

One possible answer is bad papers. In other words, I’m simply seeing the highest-quality papers and it’s a waste of time to look for the papers that are deprioritized by search engines. This is certainly possible, but:

  1. Search results tend to show a tiny fraction of papers in a space: even if the median academic paper isn’t very high quality, it’s hard to argue that 99% of papers are so bad that it’s not worth reading them.

  2. Particularly in economics, there’s a tyranny of top journals (Heckman and Moktan 2020): research in top journals is cited disproportionately, which crowds out research published elsewhere. Combining this with the rich-get-richer effect of Google Scholar citations (papers with more citations are pushed to the top, and thus get more citations) gives a clear argument that high-quality research is being overlooked simply because it never surfaces in search results.

In short, search engines perform a very valuable function, but they fail at a different valuable function: surfacing papers other than those that are in top journals, very popular, or highly downloaded. What I want is not a search engine, but a discovery engine.

I couldn’t find one, so I made one!

I am in the process of crawling SSRN to collect the details of every paper on it into a CSV file. The crawling is still in progress and will take a while to debug, but with the preliminary dataset I made a web app that displays the details of a random SSRN paper matching your specification. If you don’t enter a search keyword, it will instead give you a totally random SSRN paper. In the coming weeks I will make the app more user-friendly and flexible, and I’ll build out the database of SSRN papers to be more complete. But for now, I have a working discovery engine.
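As a rough illustration of the idea (a minimal sketch, not the app’s actual code), the core of a discovery engine like this can be just a few lines of Python: load the CSV, optionally filter by keyword, and pick one row uniformly at random. The column names `title`, `abstract`, and `url`, the function names, and the sample rows are all assumptions made up for this example.

```python
import csv
import io
import random


def load_papers(csv_file):
    """Read paper records from an open CSV file into a list of dicts."""
    return list(csv.DictReader(csv_file))


def random_paper(papers, keyword=None, rng=random):
    """Return one paper chosen uniformly at random.

    If a keyword is given, only papers whose title or abstract contains it
    (case-insensitive) are eligible. Returns None when nothing matches.
    """
    if keyword:
        needle = keyword.lower()
        candidates = [
            p for p in papers
            if needle in p["title"].lower() or needle in p["abstract"].lower()
        ]
    else:
        candidates = papers
    return rng.choice(candidates) if candidates else None


# Hypothetical sample data standing in for the crawled CSV file.
SAMPLE_CSV = """title,abstract,url
Minimum Wages and Local Labor Markets,A study of wage floors,https://example.org/1
Voting Behavior in Small Towns,Survey evidence on turnout,https://example.org/2
Trust and Trade,How social trust shapes exchange,https://example.org/3
"""

papers = load_papers(io.StringIO(SAMPLE_CSV))

print(random_paper(papers))            # any of the three papers
print(random_paper(papers, "labor"))   # only the minimum wage paper matches
print(random_paper(papers, "zebra"))   # no match, so None
```

Because every matching paper is equally likely to be drawn, a paper with zero citations has the same chance of appearing as a heavily cited one, which is the whole point of discovery as opposed to ranked search.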

References

Heckman, James J., and Sidharth Moktan. 2020. “Publishing and Promotion in Economics: The Tyranny of the Top Five.” Journal of Economic Literature 58 (2): 419–70.