When Did Beyoncé Start Becoming Popular? – Tackling One of the Most Common Problems in NLP: Q/A

By Valentin Biryukov, Head of R&D at Toloka.ai

Hello! Today I’d like to explain how to solve one of the most troublesome tasks in NLP — question answering. We’ll be labeling the SQuAD2.0 dataset with the help of Toloka-Kit — a Python library for data labeling projects that helps data scientists and ML engineers build scalable ML pipelines. But feel free to go with a different option, like Vertex AI, for instance. Let’s dive right in.

What is SQuAD?

The Stanford Question Answering Dataset (SQuAD) is used to test NLP models and their ability to understand natural language. SQuAD2.0 consists of a set of paragraphs from Wikipedia articles, along with 100,000 question-answer pairs derived from these paragraphs, and 50,000 unanswerable questions. To show good results on SQuAD2.0, a model must not only answer questions correctly, but also determine whether a question has an answer in the first place, and refrain from responding if it doesn’t.

SQuAD2.0 is the most popular question answering dataset: it’s been cited in over 1000 articles, and in the three years since its release, 85 models have been published on its leaderboard.
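To get a feel for the format before we download the real files, here is a minimal sketch that counts answerable and unanswerable questions. The `sample` dict is a tiny hand-made fragment in the SQuAD2.0 schema, not the actual dataset:

```python
# A tiny hand-made fragment in the SQuAD2.0 schema (illustrative, not real data)
sample = {
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "Paris is the capital of France.",
                    "qas": [
                        {"id": "1", "question": "What is the capital of France?",
                         "answers": [{"text": "Paris", "answer_start": 0}],
                         "is_impossible": False},
                        {"id": "2", "question": "What is the capital of Spain?",
                         "answers": [], "is_impossible": True},
                    ],
                }
            ],
        }
    ]
}

# Walk the nested article -> paragraph -> question structure
answerable = sum(
    not qa["is_impossible"]
    for article in sample["data"]
    for paragraph in article["paragraphs"]
    for qa in paragraph["qas"]
)
unanswerable = sum(
    qa["is_impossible"]
    for article in sample["data"]
    for paragraph in article["paragraphs"]
    for qa in paragraph["qas"]
)
print(answerable, unanswerable)  # 1 1
```

The same three nested loops work unchanged on the real `train-v2.0.json` and `dev-v2.0.json` files.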

The Problem

Our task is to get the correct answer to a question based on a fragment of a Wikipedia article. The answer is a segment of text from the corresponding passage, or the question may not have an answer at all. Here’s an example of text, question, and answer:

Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny’s Child. Managed by her father, Mathew Knowles, the group became one of the world’s best-selling girl groups of all time. Their hiatus saw the release of Beyoncé’s debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles “Crazy in Love” and “Baby Boy”.

question: When did Beyonce start becoming popular?

answer: [in the late 1990s]
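In the dataset, each answer is stored together with its character offset into the passage (`answer_start`), so the answer text can always be recovered by slicing the context. A minimal sketch using a shortened version of the passage above:

```python
# Shortened passage for illustration
context = (
    "Beyonce Giselle Knowles-Carter (born September 4, 1981) is an American "
    "singer who rose to fame in the late 1990s as lead singer of Destiny's Child."
)
# SQuAD stores the answer text plus its character offset into the context
answer = {"text": "in the late 1990s",
          "answer_start": context.index("in the late 1990s")}

# The span is recovered by slicing from the offset to offset + answer length
start = answer["answer_start"]
end = start + len(answer["text"])
print(context[start:end])  # in the late 1990s
```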

Let’s Talk about Crowdsourcing 

Crowdsourcing can be extremely useful in solving Q&A tasks. If you’re building a virtual assistant, a chatbot, or any other system that’s supposed to answer questions posed in natural language, you need to train your model on a dataset like SQuAD2.0. But using an open dataset is not always an option (for instance, there may be nothing available in the language you’re working with). You can use crowdsourcing to build your own dataset and make your labeling process easier.

The Solution

Let’s create two projects for our labeling pipeline:

  1. Marking project — we will collect answers to the questions from the test dataset
  2. Verification project — we will verify these answers to improve the final quality
token = input("Enter your token:")
if token == '':
    print('The token you entered may be invalid. Please try again.')
else:
    print('OK')

# Prepare an environment and import everything we need
!pip install toloka-kit==0.1.3

import datetime
import json
import time

import toloka.client as toloka
import toloka.client.project.template_builder as tb

# Create a Toloka client instance
# All API calls will pass through it
toloka_client = toloka.TolokaClient(token, 'PRODUCTION')  # or switch to 'SANDBOX'

# We check the money available in your account, which also checks the validity of the OAuth token
requester = toloka_client.get_requester()

# How much money you need for one question
PRICE_PER_TASK = 0.2
tasks_num = int(input("Enter the number of questions:"))
print('You have enough money on your account - ', requester.balance >= tasks_num * PRICE_PER_TASK)

# Download the datasets
!curl https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json --output train-v2.0.json
!curl https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json --output dev-v2.0.json

with open('dev-v2.0.json') as f:
    data = json.load(f)
with open('train-v2.0.json') as f:
    train_data = json.load(f)

Review the dataset

Our dataset is the collection of texts and questions with a list of possible answers to them.

# Printing only the first paragraph for review
{
    'title': data['data'][0]['title'],
    'paragraphs': [data['data'][0]['paragraphs'][0]]
}
{'title': 'Normans', 'paragraphs': [{'qas': [{'question': 'In what country is Normandy located?', 'id': '56ddde6b9a695914005b9628', 'answers': [{'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}, {'text': 'France', 'answer_start': 159}], 'is_impossible': False}, {'question': 'When were the Normans in Normandy?', 'id': '56ddde6b9a695914005b9629', 'answers': [{'text': '10th and 11th centuries', 'answer_start': 94}, {'text': 'in the 10th and 11th centuries', 'answer_start': 87}, {'text': '10th and 11th centuries', 'answer_start': 94}, {'text': '10th and 11th centuries', 'answer_start': 94}], 'is_impossible': False}, {'question': 'From which countries did the Norse originate?', 'id': '56ddde6b9a695914005b962a', 'answers': [{'text': 'Denmark, Iceland and Norway', 'answer_start': 256}, {'text': 'Denmark, Iceland and Norway', 'answer_start': 256}, {'text': 'Denmark, Iceland and Norway', 'answer_start': 256}, {'text': 'Denmark, Iceland and Norway', 'answer_start': 256}], 'is_impossible': False}, {'question': 'Who was the Norse leader?', 'id': '56ddde6b9a695914005b962b', 'answers': [{'text': 'Rollo', 'answer_start': 308}, {'text': 'Rollo', 'answer_start': 308}, {'text': 'Rollo', 'answer_start': 308}, {'text': 'Rollo', 'answer_start': 308}], 'is_impossible': False}, {'question': 'What century did the Normans first gain their separate identity?', 'id': '56ddde6b9a695914005b962c', 'answers': [{'text': '10th century', 'answer_start': 671}, {'text': 'the first half of the 10th century', 'answer_start': 649}, {'text': '10th', 'answer_start': 671}, {'text': '10th', 'answer_start': 671}], 'is_impossible': False}, {'plausible_answers': [{'text': 'Normans', 'answer_start': 4}], 'question': "Who gave their name to Normandy in the 1000's and 1100's", 'id': '5ad39d53604f3c001a3fe8d1', 'answers': [], 'is_impossible': True}, {'plausible_answers': [{'text': 'Normandy', 'answer_start': 137}], 'question': 'What is France 
a region of?', 'id': '5ad39d53604f3c001a3fe8d2', 'answers': [], 'is_impossible': True}, {'plausible_answers': [{'text': 'Rollo', 'answer_start': 308}], 'question': 'Who did King Charles III swear fealty to?', 'id': '5ad39d53604f3c001a3fe8d3', 'answers': [], 'is_impossible': True}, {'plausible_answers': [{'text': '10th century', 'answer_start': 671}], 'question': 'When did the Frankish identity emerge?', 'id': '5ad39d53604f3c001a3fe8d4', 'answers': [], 'is_impossible': True}], 'context': 'The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse ("Norman" comes from "Norseman") raiders and pirates from Denmark, Iceland and Norway who, under their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct cultural and ethnic identity of the Normans emerged initially in the first half of the 10th century, and it continued to evolve over the succeeding centuries.'}]}
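When iterating over this nested structure, it is handy to flatten it into plain records first. A small sketch of such a helper, exercised here on a tiny hand-made fragment in the same schema (`iter_squad` is our own illustrative helper, not part of Toloka-Kit):

```python
def iter_squad(data):
    """Yield flat records from SQuAD-format JSON: one dict per question."""
    for article in data["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield {
                    "context": paragraph["context"],
                    "question": qa["question"],
                    "question_id": qa["id"],
                    "answers": [a["text"] for a in qa["answers"]],
                    "is_impossible": qa["is_impossible"],
                }

# Hand-made fragment in the SQuAD2.0 schema for demonstration
sample = {"data": [{"title": "Normans", "paragraphs": [{
    "context": "The Normans were people who gave their name to Normandy, a region in France.",
    "qas": [{"id": "q1", "question": "In what country is Normandy located?",
             "answers": [{"text": "France", "answer_start": 69}],
             "is_impossible": False}],
}]}]}

records = list(iter_squad(sample))
print(records[0]["answers"])  # ['France']
```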

Create a new marking project

In this project, performers will try to find the answer to each question. If the text contains an answer, they paste it in; otherwise, they mark the question as unanswerable.

# How performers will see the task
radio_group_field = tb.fields.RadioGroupFieldV1(
    data=tb.data.OutputData(path='is_possible'),
    label='Does the text contain an answer?',
    validation=tb.conditions.RequiredConditionV1(),
    options=[
        tb.fields.GroupFieldOption(label='Yes', value='yes'),
        tb.fields.GroupFieldOption(label='No', value='no')
    ]
)
helper = tb.helpers.IfHelperV1(
    condition=tb.conditions.EqualsConditionV1(
        to='yes',
        data=tb.data.OutputData(path='is_possible')
    ),
    then=tb.fields.TextareaFieldV1(
        data=tb.data.OutputData(path='answer'),
        label='Paste an answer',
        validation=tb.conditions.RequiredConditionV1()
    )
)
project_interface = toloka.project.view_spec.TemplateBuilderViewSpec(
    config=tb.TemplateBuilder(
        view=tb.view.ListViewV1(
            items=[
                tb.view.TextViewV1(label='Text', content=tb.data.InputData(path='text')),
                tb.view.TextViewV1(label='Question', content=tb.data.InputData(path='question')),
                tb.view.ListViewV1(items=[radio_group_field, helper])
            ]
        )
    )
)
public_instruction = open('marking_public_instruction.html').read().strip()

# Set up the project
marking_project = toloka.project.Project(
    assignments_issuing_type=toloka.project.Project.AssignmentsIssuingType.AUTOMATED,
    public_name='Find the answer in the text',
    public_description='Read the text and find the text fragment that answers the question',
    public_instructions=public_instruction,
    # Set up the task: view, input, and output parameters
    task_spec=toloka.project.task_spec.TaskSpec(
        input_spec={
            'text': toloka.project.field_spec.StringSpec(),
            'question': toloka.project.field_spec.StringSpec(),
            'question_id': toloka.project.field_spec.StringSpec(required=False)
        },
        output_spec={
            'answer': toloka.project.field_spec.StringSpec(required=False),
            'is_possible': toloka.project.field_spec.StringSpec(allowed_values=['yes', 'no'])
        },
        view_spec=project_interface,
    ),
)
# Call the API to create a new project
# If you have already created the projects and pools, fetch this one with toloka_client.get_project('your marking project id')
marking_project = toloka_client.create_project(marking_project)
print(f'Created marking project with id {marking_project.id}')
print(f'To view the project, go to: https://toloka.yandex.com/requester/project/{marking_project.id}')
[Screenshot: how performers will see the tasks]
[Screenshot: how performers see the instructions]

Marking training

Next, let's create a training so that performers can practice before labeling real tasks. We'll add several training tasks and require performers to complete all of them before moving on to the real ones.

# Set up the training pool
marking_training = toloka.training.Training(
    project_id=marking_project.id,
    private_name='SQUAD2.0 training',
    may_contain_adult_content=True,
    assignment_max_duration_seconds=10000,
    mix_tasks_in_creation_order=True,
    shuffle_tasks_in_task_suite=True,
    training_tasks_in_task_suite_count=3,
    task_suites_required_to_pass=1,
    retry_training_after_days=1,
    inherited_instructions=True,
    public_instructions='',
)
marking_training = toloka_client.create_training(marking_training)
print(f'Created training with id {marking_training.id}')
print(f'To view the training, go to: https://toloka.yandex.com/requester/project/{marking_project.id}/training/{marking_training.id}')

We need to upload tasks for training with hints to help performers find the correct answers.

training_tasks = [
    toloka.task.Task(
        input_values={
            'question_id': '56be85543aeaaa14008c9063',
            'question': 'When did Beyonce start becoming popular?',
            'text': 'Beyoncé Giselle Knowles-Carter (/biːˈjɒnseɪ/ bee-YON-say) (born September 4, 1981) is an American singer, songwriter, record producer and actress. Born and raised in Houston, Texas, she performed in various singing and dancing competitions as a child, and rose to fame in the late 1990s as lead singer of R&B girl-group Destiny\'s Child. Managed by her father, Mathew Knowles, the group became one of the world\'s best-selling girl groups of all time. Their hiatus saw the release of Beyoncé\'s debut album, Dangerously in Love (2003), which established her as a solo artist worldwide, earned five Grammy Awards and featured the Billboard Hot 100 number-one singles "Crazy in Love" and "Baby Boy".'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_possible': 'yes', 'answer': 'in the late 1990s'})],
        message_on_unknown_solution='the answer can be found after "and rose to fame..."',
        infinite_overlap=True,
        pool_id=marking_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '56be86cf3aeaaa14008c9076',
            'question': 'After her second solo album, what other entertainment venture did Beyonce explore?',
            'text': 'Following the disbandment of Destiny\'s Child in June 2005, she released her second solo album, B\'Day (2006), which contained hits "Déjà Vu", "Irreplaceable", and "Beautiful Liar". Beyoncé also ventured into acting, with a Golden Globe-nominated performance in Dreamgirls (2006), and starring roles in The Pink Panther (2006) and Obsessed (2009). Her marriage to rapper Jay Z and portrayal of Etta James in Cadillac Records (2008) influenced her third album, I Am... Sasha Fierce (2008), which saw the birth of her alter-ego Sasha Fierce and earned a record-setting six Grammy Awards in 2010, including Song of the Year for "Single Ladies (Put a Ring on It)". Beyoncé took a hiatus from music in 2010 and took over management of her career; her fourth album 4 (2011) was subsequently mellower in tone, exploring 1970s funk, 1980s pop, and 1990s soul. Her critically acclaimed fifth studio album, Beyoncé (2013), was distinguished from previous releases by its experimental production and exploration of darker themes.'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_possible': 'yes', 'answer': 'acting'})],
        message_on_unknown_solution='the answer can be found before "... with a Golden Globe-nominated performance in Dreamgirls (2006), and starring roles in The Pink Panther (2006) and Obsessed (2009)"',
        infinite_overlap=True,
        pool_id=marking_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '5a8d7bf7df8bba001a0f9ab1',
            'question': 'What category of game is Legend of Zelda: Australia Twilight?',
            'text': 'The Legend of Zelda: Twilight Princess (Japanese: ゼルダの伝説 トワイライトプリンセス, Hepburn: Zeruda no Densetsu: Towairaito Purinsesu?) is an action-adventure game developed and published by Nintendo for the GameCube and Wii home video game consoles. It is the thirteenth installment in the The Legend of Zelda series. Originally planned for release on the GameCube in November 2005, Twilight Princess was delayed by Nintendo to allow its developers to refine the game, add more content, and port it to the Wii. The Wii version was released alongside the console in North America in November 2006, and in Japan, Europe, and Australia the following month. The GameCube version was released worldwide in December 2006.[b]'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_possible': 'no'})],
        message_on_unknown_solution='There is no game called Legend of Zelda: Australia Twilight',
        infinite_overlap=True,
        pool_id=marking_training.id
    )
]
tasks_op = toloka_client.create_tasks_async(training_tasks)
toloka_client.wait_operation(tasks_op)

Marking pool

Now we need to create a pool with real tasks.

We want manual acceptance of solutions (based on the results of the verification project) and some overlap, so that we collect several answer variants for every question.

We want to filter performers by their knowledge of English and their training results.

We also want to set up quality control:

  1. We want to ban performers who answer too fast
  2. We want to ban performers based on low quality on the golden set tasks
  3. We want to increase overlap for the task if the assignment was rejected
marking_pool = toloka.pool.Pool(
    project_id=marking_project.id,
    private_name='Pool 1',
    may_contain_adult_content=True,
    will_expire=datetime.datetime.utcnow() + datetime.timedelta(days=365),
    reward_per_assignment=0.02,
    auto_accept_solutions=False,
    auto_accept_period_day=3,
    assignment_max_duration_seconds=60*20,
    defaults=toloka.pool.Pool.Defaults(
        default_overlap_for_new_task_suites=3
    ),
    filter=toloka.filter.Languages.in_('EN'),
)
# 5 tasks per page
marking_pool.set_mixer_config(real_tasks_count=4, golden_tasks_count=1, training_tasks_count=0)
# Require performers to pass the training before they see real tasks
marking_pool.quality_control.training_requirement = toloka.quality_control.QualityControl.TrainingRequirement(
    training_pool_id=marking_training.id,
    training_passing_skill_value=30
)
# Increase overlap for the task if the assignment was rejected
marking_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentsAssessment(),
    conditions=[toloka.conditions.AssessmentEvent == toloka.conditions.AssessmentEvent.REJECT],
    action=toloka.actions.ChangeOverlap(delta=1, open_pool=True)
)
# Ban a performer if their quality on the binary "does an answer exist" classification
# is worse than random choice on the golden set
marking_pool.quality_control.add_action(
    collector=toloka.collectors.GoldenSet(),
    conditions=[
        toloka.conditions.GoldenSetCorrectAnswersRate < 50.0,
        toloka.conditions.GoldenSetAnswersCount > 4
    ],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration=1,
        duration_unit=toloka.user_restriction.DurationUnit.DAYS,
        private_comment='Golden set'
    )
)
# Ban performers who answer too fast
marking_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentSubmitTime(history_size=5, fast_submit_threshold_seconds=120),
    conditions=[toloka.conditions.FastSubmittedCount > 2],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration_unit=toloka.user_restriction.DurationUnit.PERMANENT,
        private_comment='Fast responses'
    )
)
# Another criterion to ban performers who answer too fast
marking_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentSubmitTime(fast_submit_threshold_seconds=60),
    conditions=[toloka.conditions.FastSubmittedCount > 0],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration_unit=toloka.user_restriction.DurationUnit.PERMANENT,
        private_comment='Fast responses'
    )
)
marking_pool = toloka_client.create_pool(marking_pool)
print(f'Created pool with id {marking_pool.id}')
print(f'To view the pool, go to: https://toloka.yandex.com/requester/project/{marking_project.id}/pool/{marking_pool.id}')

Let’s generate real tasks from the test dataset and golden tasks from the training dataset. In the golden set we compare only the binary yes/no classification of whether an answer exists: a question may have several different correct answers, so we can’t directly compare a performer’s answer string with the reference.

golden_tasks = []
for d in train_data['data']:
    if len(golden_tasks) == tasks_num / 2:
        break
    for paragraph in d['paragraphs']:
        if len(golden_tasks) == tasks_num / 2:
            break
        for question in paragraph['qas']:
            if len(golden_tasks) == tasks_num / 2:
                break
            golden_tasks.append(
                toloka.task.Task(
                    input_values={
                        'text': paragraph['context'],
                        'question': question['question'],
                        'question_id': question['id']
                    },
                    known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_possible': 'no' if question['is_impossible'] else 'yes'})],
                    pool_id=marking_pool.id
                )
            )

tasks = []
for d in data['data']:
    if len(tasks) >= tasks_num:
        break
    for paragraph in d['paragraphs']:
        if len(tasks) >= tasks_num:
            break
        for question in paragraph['qas']:
            if len(tasks) == tasks_num:
                break
            tasks.append(
                toloka.task.Task(
                    input_values={
                        'text': paragraph['context'],
                        'question': question['question'],
                        'question_id': question['id']
                    },
                    pool_id=marking_pool.id,
                )
            )

# The size of the golden set was restricted above; now create the tasks
tasks_op = toloka_client.create_tasks_async(golden_tasks + tasks, allow_defaults=True)
toloka_client.wait_operation(tasks_op)
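To see why direct string comparison of answers is unreliable, here is a rough sketch of a SQuAD-style normalized exact-match check, simplified from the official evaluation script. Even after normalizing case, punctuation, and articles, two different but equally correct spans still fail to match:

```python
import re
import string

def normalize_answer(s):
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(prediction, gold_answers):
    """True if the normalized prediction equals any normalized gold answer."""
    return any(normalize_answer(prediction) == normalize_answer(g) for g in gold_answers)

# Normalization tolerates articles and case...
print(exact_match("The late 1990s", ["late 1990s"]))     # True
# ...but a different correct span still fails to match
print(exact_match("in the late 1990s", ["late 1990s"]))  # False
```

This is exactly why our golden set only checks the binary is_possible verdict, and why we verify the actual answer strings with a second crowdsourcing project instead.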

Verification project

Our second project verifies the answers: performers read the text and the question, then check whether the suggested answer is correct.

# How performers will see the task
helper = tb.helpers.IfHelperV1(
    condition=tb.conditions.EqualsConditionV1(to='yes', data=tb.data.InputData(path='is_possible')),
    then=tb.view.TextViewV1(label='Answer', content=tb.data.InputData(path='answer')),
    else_=tb.view.TextViewV1(label='Answer', content='No answer in the text')
)
radio_group_field = tb.fields.RadioGroupFieldV1(
    data=tb.data.OutputData(path='is_correct'),
    label='Is the answer correct?',
    validation=tb.conditions.RequiredConditionV1(),
    options=[
        tb.fields.GroupFieldOption(label='Yes', value='yes'),
        tb.fields.GroupFieldOption(label='No', value='no')
    ]
)
verification_project_interface = toloka.project.view_spec.TemplateBuilderViewSpec(
    config=tb.TemplateBuilder(
        view=tb.view.ListViewV1(
            items=[
                tb.view.TextViewV1(label='Text', content=tb.data.InputData(path='text')),
                tb.view.TextViewV1(label='Question', content=tb.data.InputData(path='question')),
                helper,
                radio_group_field
            ]
        )
    )
)
public_instruction = open('verification_public_instruction.html').read().strip()

# Set up the project
verification_project = toloka.project.Project(
    assignments_issuing_type=toloka.project.Project.AssignmentsIssuingType.AUTOMATED,
    public_name='Check if the answer is correct',
    public_description='Read the text, the question, and the answer. Check if the answer is correct',
    public_instructions=public_instruction,
    # Set up the task: view, input, and output parameters
    task_spec=toloka.project.task_spec.TaskSpec(
        input_spec={
            'text': toloka.project.field_spec.StringSpec(),
            'question': toloka.project.field_spec.StringSpec(),
            'question_id': toloka.project.field_spec.StringSpec(required=False),
            'assignment_id': toloka.project.field_spec.StringSpec(required=False),
            'answer': toloka.project.field_spec.StringSpec(required=False),
            'is_possible': toloka.project.field_spec.StringSpec(allowed_values=['yes', 'no'])
        },
        output_spec={'is_correct': toloka.project.field_spec.StringSpec(allowed_values=['yes', 'no'])},
        view_spec=verification_project_interface,
    ),
)
verification_project = toloka_client.create_project(verification_project)
print(f'Created verification project with id {verification_project.id}')
print(f'To view the project, go to: https://toloka.yandex.com/requester/project/{verification_project.id}')
[Screenshot: how performers see the tasks]
[Screenshot: how performers see the instructions]

Verification training

Training is necessary for this project because it is hard to build a golden set here: there is no ready source of examples of correct and incorrect answers. So we create a training with different types of answers to prepare performers for the variety of possible tasks, and to filter out performers who do poorly on it.

verification_training = toloka.training.Training(
    project_id=verification_project.id,
    private_name='SQUAD2.0 training',
    may_contain_adult_content=True,
    assignment_max_duration_seconds=10000,
    mix_tasks_in_creation_order=True,
    shuffle_tasks_in_task_suite=True,
    training_tasks_in_task_suite_count=5,
    task_suites_required_to_pass=1,
    retry_training_after_days=1,
    inherited_instructions=True,
    public_instructions='',
)
verification_training = toloka_client.create_training(verification_training)
print(f'Created training with id {verification_training.id}')
print(f'To view the training, go to: https://toloka.yandex.com/requester/project/{verification_project.id}/training/{verification_training.id}')

Let’s create some different tasks to cover as many possible correct/incorrect answer options as possible.

training_tasks = [
    toloka.task.Task(
        input_values={
            'question_id': '',
            'question': 'Who wrote later papers studying problems solvable by Turning machines?',
            'answer': 'Hisao Yamada',
            'is_possible': 'yes',
            'text': 'Earlier papers studying problems solvable by Turing machines with specific bounded resources include John Myhill\'s definition of linear bounded automata (Myhill 1960), Raymond Smullyan\'s study of rudimentary sets (1961), as well as Hisao Yamada\'s paper on real-time computations (1962). Somewhat earlier, Boris Trakhtenbrot (1956), a pioneer in the field from the USSR, studied another specific complexity measure. As he remembers:'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_correct': 'no'})],
        message_on_unknown_solution='The text is about earlier papers not later ones',
        infinite_overlap=True,
        pool_id=verification_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '',
            'question': 'Who wrote the paper "Reductibility Among Combinatorial Problems" in 1974?',
            'answer': 'Richard Karp',
            'is_possible': 'yes',
            'text': 'In 1967, Manuel Blum developed an axiomatic complexity theory based on his axioms and proved an important result, the so-called, speed-up theorem. The field really began to flourish in 1971 when the US researcher Stephen Cook and, working independently, Leonid Levin in the USSR, proved that there exist practically relevant problems that are NP-complete. In 1972, Richard Karp took this idea a leap forward with his landmark paper, "Reducibility Among Combinatorial Problems", in which he showed that 21 diverse combinatorial and graph theoretical problems, each infamous for its computational intractability, are NP-complete.'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_correct': 'no'})],
        message_on_unknown_solution='"Reductibility Among Combinatorial Problems" was written in 1972',
        infinite_overlap=True,
        pool_id=verification_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '',
            'question': 'What category of game is Legend of Zelda: Australia Twilight?',
            'answer': '',
            'is_possible': 'no',
            'text': 'The Legend of Zelda: Twilight Princess (Japanese: ゼルダの伝説 トワイライトプリンセス, Hepburn: Zeruda no Densetsu: Towairaito Purinsesu?) is an action-adventure game developed and published by Nintendo for the GameCube and Wii home video game consoles. It is the thirteenth installment in the The Legend of Zelda series. Originally planned for release on the GameCube in November 2005, Twilight Princess was delayed by Nintendo to allow its developers to refine the game, add more content, and port it to the Wii. The Wii version was released alongside the console in North America in November 2006, and in Japan, Europe, and Australia the following month. The GameCube version was released worldwide in December 2006.[b]'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_correct': 'yes'})],
        message_on_unknown_solution='There is no game called Legend of Zelda: Australia Twilight',
        infinite_overlap=True,
        pool_id=verification_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '',
            'question': 'What is the name of the state that the megaregion expands to in the east?',
            'answer': 'Las Vegas',
            'is_possible': 'yes',
            'text': 'The 8- and 10-county definitions are not used for the greater Southern California Megaregion, one of the 11 megaregions of the United States. The megaregion\'s area is more expansive, extending east into Las Vegas, Nevada, and south across the Mexican border into Tijuana.'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_correct': 'no'})],
        message_on_unknown_solution='The state is actually called Nevada',
        infinite_overlap=True,
        pool_id=verification_training.id
    ),
    toloka.task.Task(
        input_values={
            'question_id': '',
            'question': 'Which city is the most populous in California?',
            'answer': 'Los Angeles',
            'is_possible': 'yes',
            'text': 'Within southern California are two major cities, Los Angeles and San Diego, as well as three of the country\'s largest metropolitan areas. With a population of 3,792,621, Los Angeles is the most populous city in California and the second most populous in the United States. To the south and with a population of 1,307,402 is San Diego, the second most populous city in the state and the eighth most populous in the nation.'
        },
        known_solutions=[toloka.task.BaseTask.KnownSolution(output_values={'is_correct': 'yes'})],
        message_on_unknown_solution='"With a population of 3,792,621, Los Angeles is the most populous city in California"',
        infinite_overlap=True,
        pool_id=verification_training.id
    )
]
tasks_op = toloka_client.create_tasks_async(training_tasks)
toloka_client.wait_operation(tasks_op)

Verification pool

Now we need to create a pool with real tasks. We want enough overlap to aggregate a verdict for every answer, and we filter performers by their knowledge of English and their training results. We also want to ban performers who answer too fast or solve captchas inaccurately.

verification_pool = toloka.pool.Pool(
    project_id=verification_project.id,
    private_name='Pool 1',
    may_contain_adult_content=True,
    will_expire=datetime.datetime.utcnow() + datetime.timedelta(days=365),
    reward_per_assignment=0.01,
    auto_accept_solutions=True,
    assignment_max_duration_seconds=60*20,
    defaults=toloka.pool.Pool.Defaults(
        default_overlap_for_new_task_suites=5
    ),
    filter=toloka.filter.Languages.in_('EN'),
)
verification_pool.set_mixer_config(real_tasks_count=5, golden_tasks_count=0, training_tasks_count=0)
verification_pool.set_captcha_frequency('MEDIUM')
# Ban performers who answer too fast
verification_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentSubmitTime(history_size=5, fast_submit_threshold_seconds=100),
    conditions=[toloka.conditions.FastSubmittedCount > 2],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration_unit=toloka.user_restriction.DurationUnit.PERMANENT,
        private_comment='Fast responses'
    )
)
# A second, stricter fast-response criterion
verification_pool.quality_control.add_action(
    collector=toloka.collectors.AssignmentSubmitTime(fast_submit_threshold_seconds=45),
    conditions=[toloka.conditions.FastSubmittedCount > 0],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration_unit=toloka.user_restriction.DurationUnit.PERMANENT,
        private_comment='Fast responses'
    )
)
# Ban performers by the captcha criterion
verification_pool.quality_control.add_action(
    collector=toloka.collectors.Captcha(history_size=5),
    conditions=[toloka.conditions.FailRate >= 60],
    action=toloka.actions.RestrictionV2(
        scope=toloka.user_restriction.UserRestriction.PROJECT,
        duration=3,
        duration_unit=toloka.user_restriction.DurationUnit.DAYS,
        private_comment='Captcha'
    )
)
verification_pool = toloka_client.create_pool(verification_pool)
print(f'Created pool with id {verification_pool.id}')
print(f'To view the pool, go to: https://toloka.yandex.com/requester/project/{verification_project.id}/pool/{verification_pool.id}')
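With an overlap of 5, every answer gets several independent yes/no verdicts that have to be merged into one. The pipeline below delegates this to Toloka's server-side Dawid-Skene aggregation, but the underlying idea is easy to see with a plain majority vote. A toy sketch for intuition only, not what the API does internally:

```python
from collections import Counter

def majority_vote(verdicts):
    """Pick the most frequent label among the overlapping verdicts for one task."""
    counts = Counter(verdicts)
    label, _ = counts.most_common(1)[0]
    return label

# Five performers checked the same suggested answer (overlap = 5)
print(majority_vote(["yes", "yes", "no", "yes", "yes"]))  # yes
```

Dawid-Skene goes further than a majority vote: it estimates each performer's reliability from their agreement with others and weights their verdicts accordingly.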

Running the pipeline

Let’s run a pipeline that verifies the answers and accepts or rejects assignments based on the verification results.

def wait_pool_for_close(pool):
    sleep_time = 60
    pool = toloka_client.get_pool(pool.id)
    while not pool.is_closed():
        print(
            f'\t{datetime.datetime.now().strftime("%H:%M:%S")}\t'
            f'Pool {pool.id} has status {pool.status}.'
        )
        time.sleep(sleep_time)
        pool = toloka_client.get_pool(pool.id)


def prepare_verification_tasks():
    verification_tasks = []  # Tasks that we will send for verification
    request = toloka.search_requests.AssignmentSearchRequest(
        # Only take completed tasks that haven't been accepted or rejected yet
        status=toloka.assignment.Assignment.SUBMITTED,
        pool_id=marking_pool.id,
    )
    # Create and store new tasks
    for assignment in toloka_client.get_assignments(request):
        for task, solution in zip(assignment.tasks, assignment.solutions):
            verification_tasks.append(
                toloka.task.Task(
                    input_values={
                        'text': task.input_values['text'],
                        'question': task.input_values['question'],
                        'question_id': task.input_values['question_id'],
                        'is_possible': solution.output_values['is_possible'],
                        'answer': solution.output_values.get('answer', '').strip(),
                        'assignment_id': assignment.id,
                    },
                    pool_id=verification_pool.id,
                )
            )
    print(f'Generated {len(verification_tasks)} new verification tasks')
    return verification_tasks


def run_verification_pool(verification_tasks):
    verification_tasks_op = toloka_client.create_tasks_async(
        verification_tasks,
        toloka.task.CreateTasksParameters(allow_defaults=True)
    )
    toloka_client.wait_operation(verification_tasks_op)
    verification_tasks_result = [
        task for task in toloka_client.get_tasks(pool_id=verification_pool.id)
        if not task.known_solutions
    ]
    task_to_assignment = {}
    for task in verification_tasks_result:
        task_to_assignment[task.id] = task.input_values['assignment_id']
    # Open the verification pool
    run_pool2_operation = toloka_client.open_pool(verification_pool.id)
    run_pool2_operation = toloka_client.wait_operation(run_pool2_operation)
    print(f'Verification pool status - {run_pool2_operation.status}')
    return task_to_assignment


def get_aggregation_results(pool_id):
    print('Start aggregation in the verification pool')
    aggregation_operation = toloka_client.aggregate_solutions_by_pool(
        type='DAWID_SKENE',
        pool_id=pool_id,
        fields=[toloka.aggregation.PoolAggregatedSolutionRequest.Field(name='is_correct')]
    )
    aggregation_operation = toloka_client.wait_operation(aggregation_operation)
    print('Results aggregated')
    return list(toloka_client.get_aggregated_solutions(aggregation_operation.id))


def set_answers_status(verification_results):
    print('Started adding results to marking tasks')
    assignment_results = dict()
    for r in verification_results:
        if r.task_id not in task_to_assignment:
            continue
        assignment_id = task_to_assignment[r.task_id]
        assignment_result = assignment_results.get(assignment_id, 0)
        # Increase the number of correct tasks in the assignment
        if r.output_values['is_correct'] == 'yes':
            assignment_result += 1
        assignment_results[assignment_id] = assignment_result
    for assignment_id, correct_num in assignment_results.items():
        assignment = toloka_client.get_assignment(assignment_id)
        if assignment.status.value == 'SUBMITTED':
            # If 4 or 5 tasks in the assignment were marked as correct, accept the assignment
            if correct_num >= 4:
                toloka_client.accept_assignment(assignment_id, 'Well done!')
            else:
                toloka_client.reject_assignment(assignment_id, 'Incorrect answers')
    print('Finished adding results to marking tasks')
toloka_client.open_pool(marking_training.id)
toloka_client.open_pool(verification_training.id)
toloka_client.open_pool(marking_pool.id)
# Run the pipeline
while True:
    print('\nWaiting for the marking pool to close')
    wait_pool_for_close(marking_pool)
    print(f'Marking pool {marking_pool.id} is finally closed!')
    # Prepare verification tasks
    verification_tasks = prepare_verification_tasks()
    # Make sure all the tasks are done
    if not verification_tasks:
        print('All the tasks in our project are done')
        break
    # Add the tasks to the verification pool and run it
    task_to_assignment = run_verification_pool(verification_tasks)
    print('\nWaiting for the verification pool to close')
    wait_pool_for_close(verification_pool)
    print(f'Verification pool {verification_pool.id} is finally closed!')
    # Aggregate the verification results
    verification_results = get_aggregation_results(verification_pool.id)
    # Accept or reject assignments in the marking pool
    set_answers_status(verification_results)
    print(f'Results received at {datetime.datetime.now()}')

Evaluate the results

Now, let’s evaluate the results. We have several answers for every question, so we need to aggregate them. We’ll decide whether a question is answerable by majority vote, and when it is, prefer the shortest non-empty answer over longer ones.

request_for_result = toloka.search_requests.AssignmentSearchRequest(
    status=toloka.assignment.Assignment.ACCEPTED,
    pool_id=marking_pool.id,
)
answers = dict()
for assignment in toloka_client.get_assignments(request_for_result):
    for i, sol in enumerate(assignment.solutions):
        answer = sol.output_values['answer'].strip() if sol.output_values['is_possible'] == 'yes' else ''
        current_list = answers.get(assignment.tasks[i].input_values['question_id'], [])
        current_list.append(answer)
        answers[assignment.tasks[i].input_values['question_id']] = current_list
final_answers = dict()
for key, value in answers.items():
    sorted_value = sorted(value, key=lambda x: len(x))
    n = len(sorted_value) // 2
    if sorted_value[n] == '':
        # The majority of labelers marked the question as unanswerable
        final_answers[key] = ''
    else:
        # Otherwise take the shortest non-empty answer
        final_answers[key] = next(filter(lambda x: x != '', sorted_value))
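To make the aggregation rule concrete, here is a standalone sketch of the same logic as a function (a hypothetical helper written for illustration, not part of Toloka-Kit), where an empty string represents "no answer":

```python
def aggregate(answers):
    """Pick the final answer from a list of labeler answers ('' = unanswerable)."""
    sorted_answers = sorted(answers, key=len)
    # The median by length is '' only if at least half the answers are empty
    median = sorted_answers[len(sorted_answers) // 2]
    if median == '':
        # Majority of labelers said the question has no answer
        return ''
    # Otherwise take the shortest non-empty answer
    return next(a for a in sorted_answers if a != '')
```

For example, `aggregate(['', '', 'in 2003'])` returns `''` because two of three labelers marked the question unanswerable, while `aggregate(['in 2003', '2003', 'the year 2003'])` returns `'2003'`, the shortest non-empty answer.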
# Download evaluation script
!curl 'https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/' --output evaluate.py

from evaluate import make_qid_to_has_ans, get_raw_scores, apply_no_ans_threshold, make_eval_dict, merge_eval

# Implement a `score` function using the helpers from the evaluation script downloaded from the official SQuAD2.0 website
def score(dataset, preds):
    na_probs = {k: 0.0 for k in preds}
    qid_to_has_ans = {k: v for k, v in make_qid_to_has_ans(dataset).items() if k in preds}  # Maps qid to True/False
    has_ans_qids = [k for k, v in qid_to_has_ans.items() if v]
    no_ans_qids = [k for k, v in qid_to_has_ans.items() if not v]
    exact_raw, f1_raw = get_raw_scores(dataset, preds)
    exact_thresh = apply_no_ans_threshold(exact_raw, na_probs, qid_to_has_ans, 1)
    f1_thresh = apply_no_ans_threshold(f1_raw, na_probs, qid_to_has_ans, 1)
    out_eval = make_eval_dict(exact_thresh, f1_thresh)
    if has_ans_qids:
        has_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=has_ans_qids)
        merge_eval(out_eval, has_ans_eval, 'HasAns')
    if no_ans_qids:
        no_ans_eval = make_eval_dict(exact_thresh, f1_thresh, qid_list=no_ans_qids)
        merge_eval(out_eval, no_ans_eval, 'NoAns')
    print(json.dumps(out_eval, indent=2))
score(data['data'], final_answers)
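For intuition about what the script measures: the core per-question metric is token-overlap F1 between the predicted and gold answer spans. A simplified re-implementation is sketched below; the official script additionally normalizes answers (lowercasing, stripping punctuation and articles) before comparing, which is omitted here:

```python
from collections import Counter

def token_f1(prediction, gold):
    """Token-overlap F1 between two answer strings (normalization omitted)."""
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    # Number of tokens shared between prediction and gold, with multiplicity
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For instance, `token_f1('in the late 1990s', 'late 1990s')` gives 2/3: two of four predicted tokens match (precision 0.5) and both gold tokens are covered (recall 1.0).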

Conclusion

Even though this project is still a work in progress, we’re already seeing promising results and we’re certain that with incremental changes and improvements we can even beat SOTA models. So, if you have any ideas on how to improve this labeling project’s architecture, settings, instructions, or result aggregation methods, or if you have any other suggestions, feel free to leave a comment. 

Source: https://hackernoon.com/when-did-beyonce-start-becoming-popular-tackling-one-of-the-most-common-problems-in-nlp-qa-s51t37ub?source=rss
