What is Ask the News?
Ask the News is a search engine designed to retrieve highly relevant open-source news reports using natural language queries. It also provides a machine-generated question and answer (Q&A) response with citations to the most relevant snippets. Ask the News lets you both find answers to narrow questions and begin broader data discovery.
It can be used across a wide range of topics, such as political visits, international relations, global business, economic trends, current affairs, and news reporting.
How To Get Started
To interact with Ask the News, navigate to the home page.
You can select from our list of suggested questions, or you can craft a question of your choosing. Please see the FAQ section below for tips and tricks for how to create the best question for Ask the News.
Time Periods
To help refine your results, you can select a time period ranging from the "last 2 hours" to the "last 31 days".
Please note: If you select a short time period, such as the "last 2 hours", results may not be returned, depending on the question asked.
Sources
Ask the News ingests over 50,000 English-language news sources.
Example Questions
Example questions include:
What is the purpose of Turkish President Recep Tayyip Erdoğan's visit to Russia?
Where has Xi Jinping visited?
What is the purpose of the meeting between Abdel Fattah al-Burhan and Abdel Fattah al-Sisi?
What technical issue occurred with the United Kingdom's air traffic control system, causing flight delays?
What social media platform is ranked as the most downloaded app in the world?
How is the global community responding to the humanitarian crisis in Gaza?
Example phrases include:
Zelensky's visit to United States
United States-Nepal Peace Corps agreement
BRICS leaders call for comprehensive United Nations reform
Antarctic sea ice decline 2023
Australian union strikes
China's loan prime rate decline
South Korea - United States annual military drills
Re-generating Answers
You can also regenerate the answer to your question to get more options for the format you would like to use in your workflows. To regenerate, select the refresh icon in the top right of the answer block. Once you have generated one or more new answers, you can cycle through the responses using the arrows in the bottom right of the answer block.
Viewing Search History
You can also view your recent searches (up to 100), view the cached answer and document list within your browser, and rerun an old query to get the latest answer. You can delete previous queries from your recent search list, and view the time range and the date each question was asked.
Please note: Previous questions and their cached results are stored in your browser. This means that if you use a different browser you will not see your past searches, and if you clear your browser's session data you will also clear the search and results history.
Tips and Tricks
Word choice and ordering can produce different results and responses, so it is recommended to experiment with different versions of the same query. Also, consider spelling out all acronyms.
As time passes and new reports are ingested by the system, the result list and answer for the same question may change.
Multiple searches on the same question may produce slightly different answers. More specific questions yield more relevant answers.
If a generative answer is not useful or is not produced for a question, the result list may still contain highly relevant reports, since the answer is generated from only a subset of the result list.
Sort the result list by date instead of relevance if you want to see the most current results. Currently, this will not change the generative response.
While this feature is designed to use a natural language question as input, it will also retrieve results for input in the form of phrases or sentences.
To optimize generative responses, focus on specifics and add context to queries. Instead of using the broad query “Israel and Hamas war”, add context for what you are searching for, e.g., “Where are the latest military engagements in the Israel and Hamas war”.
Generative outputs are known to be inconsistent, and the same query can lead to different outputs or to no output at all.
FAQs
How Does Ask the News Compare to Boolean Searches?
Unlike Boolean searches, which rely on a set of predetermined rules, Primer's semantic searches can interpret and analyze natural language queries, including synonyms, abbreviations, and context, to provide more accurate and comprehensive results.
How are documents retrieved?
At a high level, a semantic search model transforms a natural language query into a semantic embedding, which is then compared to the index of stored semantic embeddings. Step-by-step:
News reports are first ingested into the application
News reports are then separated into sentences
Each sentence is run through the semantic model and then stored in the index according to its semantic embedding
Note: A semantic embedding is a numeric representation of the text’s meaning
User queries are also given semantic embeddings
Ask the News then identifies sentences whose embeddings are similar to the embedding of the user’s query
These sentences are highlighted within the application, with added context before and after the sentence from the news report
Finally, the user is presented with a list of the most relevant news reports
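The steps above can be illustrated with a minimal Python sketch. This is not Primer's implementation: the function names are hypothetical, and the embed function is a simple word-count stand-in for a real semantic model, so it matches on shared words rather than on meaning. It is only meant to show the overall flow of splitting reports into sentences, indexing an embedding per sentence, embedding the query, and ranking sentences by similarity.

```python
import math
import re
from collections import Counter


def split_sentences(report: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def embed(text: str) -> Counter:
    """Stand-in embedding (an assumption): lowercase word counts.

    A real semantic model would return a dense vector capturing meaning.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(reports: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Store one (report_id, sentence, embedding) entry per sentence."""
    index = []
    for report_id, text in reports.items():
        for sentence in split_sentences(text):
            index.append((report_id, sentence, embed(sentence)))
    return index


def search(query: str, index, top_k: int = 3) -> list[tuple[float, str, str]]:
    """Embed the query and return the top_k most similar sentences."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), rid, sent) for rid, sent, vec in index]
    scored.sort(key=lambda hit: hit[0], reverse=True)
    return [hit for hit in scored[:top_k] if hit[0] > 0]


# Illustrative reports (made up for this sketch).
reports = {
    "r1": "Flights were delayed across the United Kingdom. "
          "A technical issue hit the air traffic control system.",
    "r2": "The central bank cut its loan prime rate. "
          "Markets rose after the announcement.",
}
hits = search("air traffic control technical issue", build_index(reports))
for score, report_id, sentence in hits:
    print(f"{score:.2f}  {report_id}  {sentence}")
```

In this toy example only the sentence about the air traffic control system shares words with the query, so it is the only hit returned. With a real semantic model, sentences that use different words for the same idea could also score highly.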
How Does the Answer Box Get Generated?
Snippets (i.e., sentences) from the most highly relevant reports are sent to a Q&A model to generate a source-based answer using only those reports. The current implementation leverages ChatGPT 4o. Generally, between five and seven report snippets are used to generate the answer. Primer intentionally trained the Q&A model to be conservative, and it may tell you a response is not producible given the results returned.
As with all generative text, please exercise caution and use the references provided to verify that the information is factually correct. The response should not be copied directly into formal reporting. Instead, use the answer to gain a quick understanding of the most relevant hits.
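As a rough illustration of how a small set of top snippets might be packaged for a Q&A model, here is a hedged Python sketch. The function name build_answer_prompt, the prompt wording, and the snippet format are all assumptions made for this example. The actual prompt and model call used by Ask the News are not documented here, and no model is called in the sketch.

```python
def build_answer_prompt(
    question: str,
    ranked_snippets: list[tuple[str, str]],
    max_snippets: int = 7,
) -> str:
    """Build a citation-friendly prompt from the most relevant snippets.

    ranked_snippets holds (source_title, snippet_text) pairs, most relevant
    first. Only the first max_snippets are used, mirroring the idea that the
    answer is based on a handful of snippets rather than the full result list.
    """
    selected = ranked_snippets[:max_snippets]
    numbered = [
        f"[{i}] {title}: {text}"
        for i, (title, text) in enumerate(selected, start=1)
    ]
    lines = [
        "Answer the question using only the numbered snippets below.",
        "Cite snippets by number. If they do not contain the answer, say so.",
        "",
        *numbered,
        "",
        f"Question: {question}",
    ]
    return "\n".join(lines)


# Ten placeholder snippets; only the first seven reach the prompt.
snippets = [(f"Report {n}", f"Snippet text {n}.") for n in range(1, 11)]
prompt = build_answer_prompt("What happened?", snippets)
print(prompt)
```

Because only the top-ranked snippets are included, a newer report further down the result list would not influence the answer in this sketch.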
What limitations should I be aware of?
The generative answer is produced only on the most highly relevant reports, not the entire result list. As a result, the response may not include the most current reports in your result list.
It may not provide a comprehensive result list or generative answer since the data available is limited to a 30-day window of news sources.
It does not include social media.
While the models used for search and generating answers are powered by large language models, you cannot directly prompt Semantic Search to perform arbitrary tasks (e.g., you cannot ask it to generate a table summarizing all of the locations found in the results list).
Only a limited set of source data is ingested, and the application can use only that data to retrieve results. For example, if you ask for a pro-countryA slogan, it may give a good answer, but if you then ask for an anti-countryA slogan, it could give you a very similar answer.
Without knowing the exact details of the query, our assumption is that certain sources could be biased, so that only “pro-country” slogans are ingested. When an “anti-country” slogan is used as a query, the model can still find slogan-type results, but they may not necessarily be “anti-country”.
In what ways is this potentially better than an NGT-style search of news?
Results should have better precision (i.e., fewer results that are not relevant) and recall (i.e., fewer relevant documents missed) since it is not dependent on keyword search using Boolean operators.
The relevance used to sort results should be more accurate since it is based on the semantic meaning of your natural language query instead of being based on the frequency of keywords from your query found in the resulting documents.
Triaging a result list should be faster since it highlights the specific sections of a report that are relevant to your query instead of presenting the entire document without context.
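For readers unfamiliar with the precision and recall terms used above, the following Python sketch shows how the two measures are computed for a single query. The document IDs are made up for illustration and are not measurements of Ask the News.

```python
def precision_recall(returned: set[str], relevant: set[str]) -> tuple[float, float]:
    """Compute precision and recall for one query.

    precision = relevant documents returned / all documents returned
    recall    = relevant documents returned / all relevant documents
    """
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 3 documents actually relevant,
# and 2 of the relevant documents appear in the returned set.
returned = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc1", "doc2", "doc5"}
precision, recall = precision_recall(returned, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here precision is 2/4 because two of the four returned documents are relevant, and recall is 2/3 because one relevant document was missed.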
How should I format my input?
Example questions and phrases have been provided in the “Example Questions” section above. Generally speaking, search results become more relevant as more context is incorporated into the query.
What types of words should I use?
Semantic searches benefit from words that provide context or meaning, e.g. nouns, verbs, adjectives, adverbs, and question words (who, what, when, where, and how).
Do I need proper grammar?
Users do not have to query in complete sentences; however, words need to be spelled correctly to retrieve relevant information.
How long should the input be?
You can type in as little as a few words (a phrase) or as much as one to two sentences. Beyond two sentences, semantic search does not perform well because it is unable to effectively capture all of the information in a single vector.
What’s the right degree to break down a question? For example, is it better to input a list of questions in the field or a single question followed up by another and another (general to narrow)?
Start by asking general questions, then move to more refined questions.
What are the benefits of a specific versus a broad question?
There are no particular benefits to a specific versus a broad question. It will depend on what the user is trying to discover. The ability of the application to discover what is needed is dependent on the data available to it. So, if there is no data that is specific enough to answer the question, it will find other semantically related data and surface that.
Why do I receive inaccurate associations to my queries?
The application will surface information based on the words in the query and their relationships; however, the application may miss the word relationships within more complex queries. For example, “What tanks did Russia use in the war against Ukraine?” will also provide results highlighting tanks used by Ukraine against Russia.
What is the best strategy for asking about more than one event or more than one location? For example, should I submit two separate questions, question 1: Is there a protest in CountryA? And then question 2: Is there a protest in CountryB? Or should I simply ask: Is there a protest in CountryA or CountryB?
Results are better when questions are not bundled, so for the example given, we recommend that you ask two separate questions.
For questions on how to leverage Ask the News or to learn more about Primer's work leveraging LLMs, please reach out to [email protected]


