2023 Research, Creativity & Community Involvement Conference

Permanent URI for this collection: https://scholarworks.montana.edu/handle/1/18074

The MSU Billings Research, Creativity & Community Involvement Conference (RCCIC) provides a great opportunity for undergraduate and graduate students of all majors to present their research and creative scholarship in a public forum. The conference is hosted every year on the MSUB campus, sponsored by the Office of Grants and Sponsored Programs, the University Honors Program, and Montana IDeA Networks of Biomedical Research (INBRE). The RCCIC is not a competition, but a celebration of the research and creative projects currently being carried out by MSUB students. All submissions are reviewed and approved by the sponsors prior to presentation or publication to ScholarWorks.

    A Characterization of Search Engine Results
    (Montana State University Billings, 2022) McShane, Elizabeth; Pannell, John (Faculty Mentor)
    Background: According to a Pew Internet Survey, 91% of online adults use some form of web search. While search engine optimization studies are commonly employed by companies to gauge their visibility in search results, few studies have been done to characterize results from the user’s perspective. We wanted to explore the impact search engine choice may have on search results by characterizing top results from several search engines.

    Aim: Previous research has relied on manual review of search results. Instead of taking this approach, we began developing and testing a set of tools to gather, analyze, and characterize search engine results automatically.

    Approach: Selenium will be used to run searches and record the top ten organic results. The URLs of the search results will be stripped down to their domains in a Python-based program, then categorized using a URL Lookup API. Finally, the results will be analyzed using a Python-based program.

    Results and Conclusions: To date, we have succeeded in gathering search results from Bing, Google, and DuckDuckGo for 50 random search terms and have stripped the URLs, leaving the domains. We have also identified a service that provides website categorization using the IAB taxonomy. The development we have done so far has allowed us to identify the following targets for future development.

    Data Gathering: Some search engines, such as Google, proved difficult to scrape, and some irregular results, such as null values, were returned. We would like to explore other methods of web scraping in addition to Selenium and develop several methods that may be able to overcome the unique scraping challenges that come with different search engines. In addition, we want to expand the search engines scraped to other, lesser-known search engines. Due to time constraints, the categorization API has not been fully integrated into the program. Thus, automated API integration is another target for future development. We would also like to identify any data, such as advertisements, that we could gather while scraping search results.

    Data Handling and Storage: In conjunction with the automated API integration, we would like to develop code that removes already-categorized URLs before handing them off to the API for categorization. Additionally, we want to develop error handling for any unusual search results that may pass through the data collection phase. As a final feature, we would like to develop an algorithm that performs basic analysis of the search results.
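
The domain-stripping and "skip already-categorized URLs" steps described in the abstract could look roughly like the sketch below. This is not the authors' code: the function names `extract_domain` and `domains_to_categorize` are hypothetical, and the sketch assumes that "domain" means the URL's host name with any port and leading `www.` removed (subdomains are kept, and no public-suffix handling is attempted). Null or malformed results, which the abstract mentions as a scraping problem, are skipped rather than raising an error.

```python
from typing import Iterable, List, Optional, Set
from urllib.parse import urlsplit


def extract_domain(url: Optional[str]) -> Optional[str]:
    """Return the lowercased host of a URL, or None if there is none.

    Hypothetical helper: drops the port and a leading 'www.', keeps subdomains.
    """
    if not url:
        return None  # null or empty scrape result
    # urlsplit only recognizes the host when a scheme or '//' is present,
    # so bare values like 'example.org/page' get a '//' prefix first.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.hostname  # already lowercased, port removed
    if not host:
        return None
    return host[4:] if host.startswith("www.") else host


def domains_to_categorize(
    urls: Iterable[Optional[str]], already_categorized: Set[str]
) -> List[str]:
    """Return unique, not-yet-categorized domains in first-seen order."""
    seen = set(already_categorized)  # copy so the caller's set is untouched
    pending = []
    for url in urls:
        domain = extract_domain(url)
        if domain is None or domain in seen:
            continue  # unusable result, duplicate, or already categorized
        seen.add(domain)
        pending.append(domain)
    return pending


if __name__ == "__main__":
    results = [
        "https://www.example.com/a",
        "http://example.com/b",
        None,
        "https://news.example.org/x",
    ]
    print(domains_to_categorize(results, {"example.org"}))
```

Only the domains returned by `domains_to_categorize` would then need to be sent to the categorization service, which keeps the number of API calls down when many search results share a site.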