Brian Carroll | Berry College | Class Homepage
An Introduction to Google-less Web and Database Searches
Readings:
• A World Without Google, The Guardian
• How Does Google Collect and Rank Results
• How Does Google Determine Which Web Sites Are the Most "Trusted"?
• A Primer in Boolean Logic
• If a Tree Doesn't Fall on the Internet, Does It Really Exist?
• To Google or Not to Google? That, dear Brutus, is the question.
• Is Google good or evil?
What you have to do:
Locate the information requested for each item below. The answer you provide must appear in a database available via Berry Library’s online research sources.
For each response, provide: 1) a brief summary of the answer, 2) a bibliographic citation (author, title, publication, date of publication and (optional) page number/s) of the source, and 3) the database or research system you used to find the source or answer.
You do not have to follow a certain citation format. Simply provide the information requested above. To the hunt:
- Find an article authored by BC (your professor) and provide its bibliographic information, including number of citations (end notes). Hint: What is the primary mass comm database available through Berry?
- A Supreme Court case pertaining to copyright, Metro-Goldwyn-Mayer Studios v. Grokster, was decided in June 2005 by the United States Supreme Court. Who delivered (wrote) the opinion of the Court? What was the vote?
Hint: The legal research tool of LexisNexis allows you to find full information on federal cases. So do Findlaw.com, LexisONE and Oyez.org, among others.
- Find the most recent annual revenues for Electronic Arts, a videogame software development company. What region of the world showed the most growth, in percentage terms, for that fiscal year? (North America, Europe, Asia)
Hint: Where are financials reported for public companies? (SEC.gov, LexisNexis SEC)
- You are editing a news story about an event that happened in Sevier County, Tennessee. As background for the story you need to know the estimated total population of the county in 2001. According to the U.S. Census Bureau's American Community Survey (2001), what was the estimated total population of Sevier County?
- What is the U.S. Arab population? Is it going up or down?
- What is the air quality in Floyd County?
- Who are our U.S. Senators? How can we reach these people? How did they vote on the bill to authorize stem cell research?
- Who wrote “The limits on University control of graduate student speech” in the Yale Law Journal, March 2003 issue, volume 112, no. 5, page 1295?
- What is a viatical settlement?
- Who is the founder of BetOnSports PLC? What crime is he accused of committing?
Glossary of Research Terminology
An aggregator brings together similar resources from different sources. For example, Factiva aggregates news and business sources. (See also: RSS aggregator)
A Weblog (or blog) is a Web page or Web site created with software such as Blogger or Radio Userland. Blogs typically consist of commentary or notes arranged in reverse chronological order so that the newest information appears at the top. First appearing around 1997, blogs cover a range of topics from the personal to the professional and provide an abundance of links, which makes them particularly suited to delivering current information.
The increasing popularity of Weblogs might be attributed to the fact that their authors do not have to know HTML. The software facilitates the creation of the page or site. While it's common for one person to author all entries, groups of people may also contribute.
A controlled vocabulary consists of words and phrases designated to describe the contents of a database. Think of it as an index -- like one you would find in the back of a book. If you know the controlled vocabulary for a particular concept, then it is possible to retrieve all the relevant information in the database regardless of whether the terms appear as keywords in the documents.
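To see why a controlled vocabulary matters, consider the minimal sketch below. The three records, their subject terms and the function names are all made up for illustration; a real database assigns subject terms through its own thesaurus. The point is only that a subject search can find a record whose text never uses the word you typed.

```python
# Hypothetical records: free text plus subject terms assigned by an indexer.
records = [
    {"id": 1, "text": "Teens and their cars: a history of cruising",
     "subjects": ["Automobiles", "Adolescents"]},
    {"id": 2, "text": "Why the automobile changed American suburbs",
     "subjects": ["Automobiles", "Suburbs"]},
    {"id": 3, "text": "Bicycle commuting on the rise",
     "subjects": ["Bicycles"]},
]

def keyword_search(word):
    """Return ids of records whose text contains the exact word."""
    word = word.lower()
    return [r["id"] for r in records if word in r["text"].lower().split()]

def subject_search(term):
    """Return ids of records indexed under the controlled-vocabulary term."""
    return [r["id"] for r in records if term in r["subjects"]]

print(keyword_search("automobile"))   # [2]
print(subject_search("Automobiles"))  # [1, 2]
```

The keyword search misses record 1 because its text says "cars," while the subject search retrieves it because the indexer filed it under "Automobiles."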
Databases hold information in much the same way as airplanes hold passengers. Each seat on an airplane possesses a unique label -- the row and seat number -- that identifies a passenger. In a database, records possess unique identifiers to distinguish the information they contain.
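The airplane analogy can be sketched in a few lines with Python's built-in sqlite3 module. The table name, seat labels and passenger names are invented for illustration, and this is only one of many ways a database might enforce unique identifiers.

```python
import sqlite3

# An in-memory database; the seat label plays the role of the unique identifier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (seat TEXT PRIMARY KEY, name TEXT)")

def add_passenger(seat, name):
    """Return True if the seat was free, False if that identifier is taken."""
    try:
        conn.execute("INSERT INTO passengers VALUES (?, ?)", (seat, name))
        return True
    except sqlite3.IntegrityError:
        return False

def find_passenger(seat):
    """Look up a record by its unique identifier; None if no such seat."""
    row = conn.execute(
        "SELECT name FROM passengers WHERE seat = ?", (seat,)
    ).fetchone()
    return row[0] if row else None

add_passenger("12A", "A. Reader")
add_passenger("12B", "B. Writer")
print(find_passenger("12A"))               # A. Reader
print(add_passenger("12A", "C. Editor"))   # False: seat 12A is already assigned
```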
"Finding" tools help you locate information. They include search engines, indexes or subject directories, library catalogs, bibliographic databases and specialty publications available in certain fields of study. Law digests, for instance, help you find case law.
Podcasting refers to a simple XML technology for delivering audio content to players, such as iPods. Note, however, that you do not have to use an iPod. Since many podcasts appear in MP3 format, any MP3-compatible player will do. The librarian-managed portal, Free Government Information, compiles an index of podcasts from government sources.
Primary sources originate information. In the legal field, primary sources make the law. Congress, state legislatures, town councils, courts and government agencies all constitute primary sources of law.
In general, primary sources include people, institutions and organizations, government agencies, advocacy groups and other entities that create original material or generate data.
Working with primary sources can be difficult. Sometimes it's impossible to verify primary information, as the originator is the publisher (or producer). Sometimes it's difficult to digest primary information because it is in its original form and not yet interpreted. A court decision, for example, is a primary source; a legal article about the decision is a secondary source.
Proximity connectors are search commands (syntax) used to specify a relationship between two or more keywords. For instance, if ADJ stands for adjacent, and you enter the search statement wall ADJ street, you are instructing the search engine to retrieve all instances of the word wall when it appears adjacent to the word street. Other examples of proximity connectors include S (same sentence), P (same paragraph), NEARn, W/n (within a certain number of words) and PREn (immediately preceding). There are numerous variations of these examples.
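The sketch below imitates two proximity connectors on a single sentence. It assumes that ADJ means the second word immediately follows the first and that W/n ignores word order; real research systems define these connectors differently, so treat the function names and rules here as illustrative only.

```python
import re

def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def adj(text, first, second):
    """Imitate `first ADJ second`: second immediately follows first (assumed rule)."""
    words = tokenize(text)
    return any(a == first and b == second for a, b in zip(words, words[1:]))

def within(text, first, second, n):
    """Imitate `first W/n second`: the words appear within n words, either order."""
    words = tokenize(text)
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= n for i in first_positions for j in second_positions)

print(adj("Wall Street rallied on Tuesday.", "wall", "street"))        # True
print(adj("The street behind the wall", "wall", "street"))             # False
print(within("The street behind the wall", "wall", "street", 3))       # True
```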
Research systems store, or provide access to, two or more databases. Factiva and LexisNexis (available via UNC Databases), for example, are research systems. They provide access to thousands of databases consisting primarily of groups of (or individual) publications (e.g., The Wall Street Journal, regional and national newspapers).
RSS, or Really Simple Syndication, is simple XML (eXtensible markup language) coding that enables syndicating content. Pioneered by Netscape, RSS delivers headlines, content summaries or full-text content, depending on the whim of the author. While it's particularly suited to delivering current information, the reader needs--with some exceptions--additional software (called newsreaders or aggregators) to display the feed. Web-based aggregators such as Bloglines may some day eliminate the need for additional software.
Weblog software, such as Blogger or Radio Userland, typically allows users to create RSS feeds automatically. However, you should be aware that several RSS versions exist, and not all versions are compatible with all aggregators.
An RSS aggregator is software or a Web service that pulls together XML-based news feeds (RSS, Atom, XML) from different sources. Also called an RSS reader.
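What an aggregator does can be sketched with Python's standard XML parser. The two feeds below are invented, stripped-down RSS 2.0 samples stored as text; a real aggregator would fetch feeds over the Web and cope with several RSS versions and with Atom, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

# Two hypothetical RSS 2.0 feeds, reduced to the bare minimum.
FEED_A = """<rss version="2.0"><channel><title>Campus News</title>
<item><title>Library extends hours</title><link>http://example.com/a1</link></item>
<item><title>New database trial</title><link>http://example.com/a2</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Court Watch</title>
<item><title>Ruling on file sharing</title><link>http://example.com/b1</link></item>
</channel></rss>"""

def headlines(feed_xml):
    """Return (source, headline, link) for every item in one feed."""
    root = ET.fromstring(feed_xml)
    source = root.findtext("channel/title")
    return [(source, item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

def aggregate(feeds):
    """Pull the items from several feeds into one combined list."""
    combined = []
    for feed_xml in feeds:
        combined.extend(headlines(feed_xml))
    return combined

for source, title, link in aggregate([FEED_A, FEED_B]):
    print(f"{source}: {title} ({link})")
```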
Search engines power databases like jet engines propel airplanes. Just as there are different jet engine models for aircraft, there are different search engines for databases. Almost all have the force of a supercharged jet engine, but you might not always experience this power. Search engines such as Google, Yahoo and MSN Search possess more capabilities than the average searcher experiences. This may make them appear inferior to traditional research systems such as LexisNexis, when in fact they employ equal or superior technology. Key differences important to researchers have more to do with the quality and condition of the data residing in the search engine database than the search technology.
A search statement is the query you enter in a database or search engine.
Secondary sources of information offer an explanation or analysis. They include books, journal articles, newspapers, encyclopedias and other explanatory materials.
SGML (Standard Generalized Markup Language) is a publishing standard adopted by the International Organization for Standardization (ISO) in 1986. It enables platform- and application-independent documents, which retain their formatting, indexing and linked information.
Syntax is the search logic or command/s used to retrieve information. Each search engine utilizes its own syntax. Since there are no industry standards, making the best use of a search tool requires learning its language.
Unstructured information is not defined in a way that makes it easily retrievable by any one of its components. An article appearing on the Web in plain old HTML has few, if any, defined components. HTML allows for a "title" definition. It also provides for some other components -- author, copyright owner and subject tags, for example. But search engines largely ignore these pieces.
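To illustrate how few components plain HTML defines, the sketch below scans an invented page with Python's standard HTML parser and reports only the pieces the markup actually labels. The page, the class name and the author value are hypothetical.

```python
from html.parser import HTMLParser

# A hypothetical page: only the title and one meta tag are labeled components.
PAGE = """<html><head><title>City council votes on budget</title>
<meta name="author" content="J. Doe">
</head><body><p>By J. Doe. The council met Tuesday to vote.</p></body></html>"""

class ComponentFinder(HTMLParser):
    """Collect the title and any named meta tags; ignore everything else."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.components = {}

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes and "content" in attributes:
                self.components[attributes["name"].lower()] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.components["title"] = self.components.get("title", "") + data

finder = ComponentFinder()
finder.feed(PAGE)
print(finder.components)
# {'title': 'City council votes on budget', 'author': 'J. Doe'}
```

Nothing in the body is labeled, so the date, the byline in the text and the subject of the story cannot be retrieved as separate components.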
Derived from SGML and developed by the World Wide Web Consortium (W3C), XML (eXtensible markup language) is a specification that lets you create your own markup language for structuring documents and data on the Web.
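As a rough illustration of creating your own markup, the sketch below builds a tiny XML document for a news article and then retrieves one component by name. The element names (article, headline, byline, dateline) are made up for this example and do not come from any published standard.

```python
import xml.etree.ElementTree as ET

# Build a document using element names we invented for news articles.
article = ET.Element("article")
ET.SubElement(article, "headline").text = "City council votes on budget"
ET.SubElement(article, "byline").text = "J. Doe"
ET.SubElement(article, "dateline").text = "ROME, Ga."

# Serialize to text, as it might be stored or sent over the Web.
xml_text = ET.tostring(article, encoding="unicode")
print(xml_text)

# Because each component is labeled, it can be retrieved directly.
parsed = ET.fromstring(xml_text)
byline = parsed.findtext("byline")
print(byline)  # J. Doe
```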