QAWiki:Guide/Questions

Guidelines for creating questions

Here are some tips for selecting and writing questions.

Selecting questions

QAWiki currently focuses on collecting factual questions that can be answered over knowledge bases such as Wikidata. Subjective questions such as "What is the best movie directed by Kubrick?", procedural questions such as "How can I tie a tie?", and advice-seeking questions such as "How can I save money on flights?" are therefore not currently the focus. However, it might be possible to define related objective questions in such cases, e.g., "What movies directed by Stanley Kubrick won the most Academy Awards?".

Question labels and aliases

The main label for the question should be quite specific, while aliases can provide more informal or concise ways to ask the same question. A good example is the main label "What movies directed by Stanley Kubrick won the most Academy Awards?", which gives the full name of the director and explicitly mentions the director relation. An alias like "Which Kubrick movie won the most Oscars?" captures the same intent, but less explicitly: only the director's last name is given, the director relation is implied, and the informal name "Oscars" is used instead of the more official "Academy Awards".

Questions for language variants

In general, we assume that language variants inherit from the more general language code, so there is no need to duplicate questions, aliases, etc., from the general language to a language variant. For example, for the question "Which Kubrick movie won the most Oscars?" (en), there is no need to duplicate the question to, e.g., (en-UK). However, in the case of "What are the colors of the French flag?" (en), adding "What are the colours of the French flag?" (en-UK) is welcome to capture the "colour" spelling variant.
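The inheritance rule described here can be read as a simple fallback lookup. The following is a minimal sketch only: the `LABELS` store, the question identifiers, and the `label_for` helper are hypothetical names used for illustration and are not part of QAWiki.

```python
from typing import Dict, Optional

# Hypothetical in-memory store: question id -> {language code -> label}.
LABELS: Dict[str, Dict[str, str]] = {
    "flag-colors": {
        "en": "What are the colors of the French flag?",
        "en-UK": "What are the colours of the French flag?",
    },
    "kubrick-oscars": {
        "en": "Which Kubrick movie won the most Oscars?",
    },
}


def label_for(question_id: str, lang: str) -> Optional[str]:
    """Return the label for `lang`, falling back to the general language code."""
    labels = LABELS.get(question_id, {})
    if lang in labels:
        # The variant has its own label (e.g. a spelling difference).
        return labels[lang]
    # Otherwise inherit from the general code: "en-UK" falls back to "en".
    general = lang.split("-", 1)[0]
    return labels.get(general)
```

With this sketch, `label_for("kubrick-oscars", "en-UK")` returns the (en) label because no variant was defined, while `label_for("flag-colors", "en-UK")` returns the "colours" spelling.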

Guidelines for linking questions

Here are some relations that you can define between questions to help guide users of systems built upon QAWiki.

Ambiguous questions

Ambiguous questions like "What is the largest country in Africa?" are welcome, but they should be tagged as an instance of the class ambiguous question and should not provide a query. Ideally, each ambiguous question should link to one or more candidate questions that the user might mean, such as "What is the largest country in Africa by population?" and "What is the largest country in Africa by area?". Systems using QAWiki can then suggest these more specific questions to the user in a "Did you mean ..." style interaction.

Other types of ambiguity might arise when entities mentioned in the question are themselves ambiguous. To avoid a proliferation of disambiguation, it can be assumed that the most "obvious" entity in the context of the question applies in general, much like in Wikipedia, where Michael Jackson points directly to the famous singer. However, there may be cases where there is no one "obvious" entity being referred to, in which case an ambiguous question can be created.
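As an illustration of the "Did you mean ..." interaction, the sketch below models an ambiguous question that carries no query and links to candidate questions. The `Question` class, its field names, and the placeholder query strings are assumptions made for this example; they do not describe how QAWiki or any particular system stores its data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Question:
    label: str
    query: Optional[str] = None   # ambiguous questions carry no query
    is_ambiguous: bool = False
    candidates: List["Question"] = field(default_factory=list)


def respond(question: Question) -> str:
    """Suggest candidate questions if ambiguous, otherwise hand back the query."""
    if question.is_ambiguous:
        options = "; ".join(c.label for c in question.candidates)
        return f"Did you mean: {options}"
    return f"Run query: {question.query}"


# Placeholder query strings stand in for real queries.
by_population = Question(
    "What is the largest country in Africa by population?",
    query="<query ranking by population>",
)
by_area = Question(
    "What is the largest country in Africa by area?",
    query="<query ranking by area>",
)
ambiguous = Question(
    "What is the largest country in Africa?",
    is_ambiguous=True,
    candidates=[by_population, by_area],
)
```

Calling `respond(ambiguous)` yields a suggestion listing both candidate labels, while `respond(by_area)` hands back that question's own query.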

Analogous questions

The goal of QAWiki is to create a diverse collection of questions and queries. In general, we do not recommend creating many instances of analogous questions, such as "What is the capital of X?" for many values of X. A single instance, such as "What is the capital of Ireland?", is often sufficient, since other examples can be generated automatically.

However, there may be cases where it is useful to have several manually created instances of an analogous question (that vary only in their particular selection of entities). This may be due to having different types of entities (e.g., the capital of a state vs. a country), to capture different subtleties (e.g., the gender of the entity and how it affects the expression of the question), etc. If one wishes to add many specific instances of a question (whose query varies only in its subject/object entities in a one-to-one manner), we recommend using the analogous question property to link these instances; for example, see "When did Stan Lee die?" linked with "When did Ian Curtis die?".

This property should not be used where the underlying intent of the question is different. For example, "What is the capital of IBM?" may refer to something different (the working capital of a business) and would thus not be analogous to a question about the capital of a country (the query would require a different property).
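To show what generating further instances automatically could look like, here is a minimal sketch that fills a question template with different entities. The `{entity}` placeholder syntax and the `instantiate` helper are hypothetical and only illustrate the idea; the same substitution would also have to be applied to the underlying query.

```python
from typing import Iterable, List


def instantiate(template: str, entities: Iterable[str]) -> List[str]:
    """Produce one analogous question per entity from a single template."""
    return [template.format(entity=entity) for entity in entities]


# One manually written instance implies the template; the rest are generated.
capital_questions = instantiate(
    "What is the capital of {entity}?",
    ["Ireland", "France", "Chile"],
)
```

Note that such a template should only be filled with entities of a suitable type: filling it with "IBM" would produce a question with a different intent, which is the case the guideline warns against.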

Contingent and substitute questions

Some questions might be built upon false assumptions, such that an answer to the question (even if blank) might be misleading. For example, if we answer the question "What age is Stan Lee?" by subtracting his date of birth from the current date, this might falsely give the impression that he is still alive. To avoid misunderstandings, we can add this question, but link it to a contingent question "Is Stan Lee alive?", along with an expected answer (in this case, we expect the answer to be true).

We can further add qualifiers to suggest zero or more substitute questions in case the contingent question fails. For example, if Stan Lee is not alive and a user asks his age, we can instead suggest asking "What age was Stan Lee when he died?" or "What age would Stan Lee be if he were still alive?".

We recommend that contingent questions be added even if the specific entities satisfy the assumptions. For example, the question "What age is Justin Bieber?" includes a contingent question, even though Justin Bieber is indeed alive.
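The contingent/substitute mechanism can be sketched as a small check performed before answering. Everything named below (`ContingentLink`, `questions_to_offer`, and the `ask_boolean` callback that answers a yes/no question) is an assumption for illustration, not an actual QAWiki interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ContingentLink:
    contingent_question: str
    expected_answer: bool
    substitutes: List[str] = field(default_factory=list)  # zero or more


def questions_to_offer(
    question: str,
    link: ContingentLink,
    ask_boolean: Callable[[str], bool],
) -> List[str]:
    """Keep the question if its assumption holds, else offer the substitutes."""
    if ask_boolean(link.contingent_question) == link.expected_answer:
        return [question]
    return list(link.substitutes)


stan_lee_age = ContingentLink(
    contingent_question="Is Stan Lee alive?",
    expected_answer=True,
    substitutes=[
        "What age was Stan Lee when he died?",
        "What age would Stan Lee be if he were still alive?",
    ],
)

# Stand-in for a system that answers yes/no questions; here it always says no.
offered = questions_to_offer("What age is Stan Lee?", stan_lee_age, lambda q: False)
```

Because the stand-in answers "no" to "Is Stan Lee alive?", `offered` contains the two substitute questions rather than the original one. If the list of substitutes is empty, the system simply has nothing to suggest.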

Broader/narrower questions

When a user asks an initial question, they might find that the results are too broad or too narrow for their liking. To assist with this, QAWiki allows broader or narrower relationships to be defined between questions in order to suggest follow-up questions to users.

These relations are intended to have a specific meaning. We say that question A is narrower than question B if (and only if) any answer to question A must also be an answer to question B, irrespective of the current data. We say that question A is broader than question B if (and only if) question B is narrower than question A. (Formally, this corresponds to the notion of query containment.)

As an example, a user might first ask "What species are in the order Monotremata?". Upon receiving the answers, they might notice that a lot of the species returned are extinct. As a follow-up, they might thus ask a narrower question like "What extant species are in the order Monotremata?" (or "What extinct species are in the order Monotremata?"). When adding these questions, we can define a narrower relation from the original question to the latter more specific questions, and a broader relation from the latter questions to the original question. Another example is the question "Which science fiction movies star Uma Thurman?", which is defined to be narrower than the question "Which movies star Uma Thurman?".
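The definitions above can be illustrated with a short sketch. It derives broader relations as the inverse of narrower ones, and adds a sanity check over one dataset. Since the definition holds irrespective of the current data, a subset check on one dataset can only reveal that a narrower relation is wrong; it cannot prove that the relation is right. The pair representation, the function names, and the hand-written sample answer sets are assumptions for illustration and are not query results.

```python
from typing import Set, Tuple

ALL_Q = "What species are in the order Monotremata?"
EXTANT_Q = "What extant species are in the order Monotremata?"
EXTINCT_Q = "What extinct species are in the order Monotremata?"

# Pairs (a, b) meaning "question a is narrower than question b".
NARROWER: Set[Tuple[str, str]] = {(EXTANT_Q, ALL_Q), (EXTINCT_Q, ALL_Q)}


def broader_pairs(narrower: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """A is broader than B if and only if B is narrower than A."""
    return {(b, a) for (a, b) in narrower}


def contradicts_narrower(answers_a: Set[str], answers_b: Set[str]) -> bool:
    """True if some answer to A is not an answer to B on this dataset."""
    return not answers_a <= answers_b


# Hand-written sample answers, for illustration only.
all_species = {
    "Ornithorhynchus anatinus",
    "Tachyglossus aculeatus",
    "Obdurodon dicksoni",
}
extant_species = {"Ornithorhynchus anatinus", "Tachyglossus aculeatus"}
```

Here `contradicts_narrower(extant_species, all_species)` is false, which is consistent with the narrower relation, whereas swapping the arguments gives true, showing that the general question cannot be narrower than the extant-only one.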

Negated questions

We can also indicate questions that are the negation of other questions. This specifically relates to questions with a boolean (yes/no) answer, where a yes for a question should imply a no for the negated question, and vice versa; for example "Is Stan Lee alive?" vs. "Is Stan Lee dead?".
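The relation between a boolean question and its negation can be sketched as follows. The `NEGATION_OF` mapping and both helper functions are hypothetical names chosen for this example.

```python
from typing import Dict

# Hypothetical mapping from a yes/no question to its negated question.
NEGATION_OF: Dict[str, str] = {"Is Stan Lee alive?": "Is Stan Lee dead?"}


def answer_to_negation(answer: bool) -> bool:
    """A yes for a question implies a no for its negation, and vice versa."""
    return not answer


def is_consistent(answer: bool, negated_answer: bool) -> bool:
    """Two answers are consistent only if exactly one of them is yes."""
    return answer != negated_answer
```

A system could use such a check to flag a pair of questions whose queries return the same boolean, which would suggest that one of the two queries is wrong or that the questions are not true negations of each other.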