RPD Extraction


  • This scenario provides an easy way to query the RISIS Patent Database and produce sub-corpora from it. It is designed to let you search through millions of patents and create customized sub-corpora. Once the data is extracted, it can be analyzed using other scenarios or downloaded and used with other tools.


  1. Log in to https://rcf.risis.io.

  2. Create a project.

  3. Import the scenario ‘RPD Extraction’ into the project.

  4. Configure the project by filling in the inputs (see rpd-extraction-scenario.png):

    • Search criteria: Use the query factory to customize your search criteria. For this example, we wanted the earliest_filing_year to be greater than '2012'.
    • Maximal number of records to return: Use this field to constrain the number of records returned. For this example, we set the limit to '1000'.
    • Dataset title: Enter the title for the new dataset. For this example, we selected 'My Patent Dataset'.
    • User e-mail: Enter the e-mail address of the user to share the extraction with.
  5. When all the required fields are filled in, click the ‘Run this scenario’ button in the top right.

  6. After the scenario execution is finished, check the Outputs of the project for the new files produced: Patents Basic Info Extraction Result, Patent Technical Classification Info Extraction Result, Patent Actors Extraction Result, Patent Inventors Info Extraction Result and Patents Full Info Extraction Result.
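
To make the effect of the inputs in step 4 concrete, here is a minimal Python sketch of what the example configuration asks for: keep patents whose earliest_filing_year is greater than 2012 and return at most 1000 of them. It does not call the platform and is not how the scenario is implemented. The ExtractionConfig class, the extract function, the patent_id field and the sample records are all hypothetical, and the year is assumed to be stored as an integer.

```python
from dataclasses import dataclass
from itertools import islice


@dataclass
class ExtractionConfig:
    """Hypothetical stand-in for the scenario inputs described in step 4."""
    min_year_exclusive: int = 2012          # earliest_filing_year must be greater than this
    max_records: int = 1000                 # maximal number of records to return
    dataset_title: str = "My Patent Dataset"


def extract(records, config):
    """Return records filed after the threshold year, capped at max_records."""
    matching = (
        record for record in records
        if record["earliest_filing_year"] > config.min_year_exclusive
    )
    # islice stops as soon as the cap is reached, so the rest is never scanned.
    return list(islice(matching, config.max_records))


# Made-up sample records, for illustration only.
sample = [
    {"patent_id": "A", "earliest_filing_year": 2010},
    {"patent_id": "B", "earliest_filing_year": 2013},
    {"patent_id": "C", "earliest_filing_year": 2012},
    {"patent_id": "D", "earliest_filing_year": 2015},
]

config = ExtractionConfig()
result = extract(sample, config)
print(config.dataset_title, [record["patent_id"] for record in result])
```

Record C is excluded because the criterion is strictly greater than 2012. Lowering max_records truncates the result in input order, which suggests why a small limit is convenient for a first test run before extracting a larger sub-corpus.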