Examiner Monitor#

This notebook shows how patent_client can be used to examine what is on a particular examiner’s docket, and some simple statistics for a particular examiner.

The input to this will just be an application that we know the examiner is handling - like for example a case we just got an Office Action from.

[1]:
app_no = '16538065'

Next, we’ll do our standard imports, query the USApplication object for that application, and retreive the examiner’s name

[2]:
import pandas as pd
import numpy as np
from patent_client import USApplication

app = USApplication.objects.get(app_no)
print(f'The examiner for {app.appl_id} is {app.app_exam_name} of art unit {app.app_grp_art_number}')
The examiner for 16538065 is TRAN, ANHTAI V of art unit 2168

Then, we’ll go and retreive all the cases that this examiner has ever handled by searching USApplication for all applications listing the same examiner, and then populate a subset of that data into a Pandas DataFrame.

NOTE: Because the USApplication object is lazy, this might take a minute - patent_client is now retrieving the application data

[3]:
exam = USApplication.objects.filter(app_exam_name=app.app_exam_name)
exam_df = (exam.values('appl_id', 'app_filing_date', 'patent_title', 'patent_number', 'app_status', 'app_status_date')
    .to_pandas()
)
exam_df.head()
[3]:
appl_id app_filing_date patent_title patent_number app_status app_status_date
0 17077511 2020-10-22 HANDLING OUT-OF-ORDER DATA DURING STREAM PROCE... None Publications -- Issue Fee Payment Verified 2022-08-11
1 16648169 2020-03-17 PARSING REGULAR EXPRESSIONS WITH SPIKING NEURA... None Non Final Action Mailed 2022-08-10
2 16932252 2020-07-17 LOGICAL PARTITIONS VIA HEADER-BASED PARTITION ... None Notice of Allowance Mailed -- Application Rece... 2022-08-09
3 17038351 2020-09-30 SYSTEM AND METHOD FOR BLOOM FILTERS IN LARGE S... None Notice of Allowance Mailed -- Application Rece... 2022-08-08
4 16538065 2019-08-12 SYSTEMS AND METHODS FOR ANALYZING THE VALIDITY... None Advisory Action Mailed 2022-08-08

Simple Lookups#

Now that we have all the Examiner’s cases, we want to take a peek at what the Examiner is working on. To do that, we’ll make a list of application status names that we’re interested in - those that indicate that the Examiner is still working on the case and the case isn’t abandoned, allowed, or awaiting a first office action

[4]:
active_pros_statuses = [
    'Advisory Action Mailed',
    'Board of Appeals Decision Rendered',
    'Non Final Action Mailed',
    'Response after Final Action Forwarded to Examiner',
    'Response to Non-Final Office Action Entered and Forwarded to Examiner',
    'Final Rejection Mailed',
]

active_df = exam_df[exam_df['app_status'].isin(active_pros_statuses)]
active_df.head()
[4]:
appl_id app_filing_date patent_title patent_number app_status app_status_date
1 16648169 2020-03-17 PARSING REGULAR EXPRESSIONS WITH SPIKING NEURA... None Non Final Action Mailed 2022-08-10
4 16538065 2019-08-12 SYSTEMS AND METHODS FOR ANALYZING THE VALIDITY... None Advisory Action Mailed 2022-08-08
5 16985917 2020-08-05 SYSTEM FOR AUTOMATED MATERIAL MASTER DATA HARM... None Response to Non-Final Office Action Entered an... 2022-08-05
6 16796106 2020-02-20 SYSTEMS AND METHODS FOR PREDICTING WHERE CONVE... None Final Rejection Mailed 2022-08-04
7 17158895 2021-01-26 DECLARATIVE SEGMENT ATTRIBUTE ACTIVATION None Final Rejection Mailed 2022-08-04

Maybe we’re trying to predict when an Examiner might get to a particular case. To do that, we can use a Pandas filter to select all the cases that have the status “Docketed New Case - Ready for Examination”

[5]:
docket_df = exam_df[exam_df['app_status'] == 'Docketed New Case - Ready for Examination']
docket_df.head()
[5]:
appl_id app_filing_date patent_title patent_number app_status app_status_date
132 17362419 2021-06-29 METHODS AND APPARATUS TO ESTIMATE CARDINALITY ... None Docketed New Case - Ready for Examination 2021-08-24

Then we can use a simple function to count how many cases are ahead of ours, first by sorting by filing date, then counting how many cases down our application is:

[6]:
docket_df = docket_df.sort_values('app_filing_date', ascending=True)

counter = 0

for i, row in docket_df.iterrows():
    if row['appl_id'] == app_no:
        break
    counter += 1

print(f"There are {counter} cases ahead of {app.appl_id} on Examiner {app.app_exam_name}'s Docket")
There are 1 cases ahead of 16538065 on Examiner TRAN, ANHTAI V's Docket

Statistical Analysis#

Maybe we want some more detailed information. Things like average pendency, overall case load, etc. Let’s do some of those statistics now.

First, we’ll take a look at the examiner’s entire docket, bucketed by years, to get a general sense for the Examiner’s workflow and case dispositions. To start, we’ll add a column that will classify a case as either “Patented”, “Abandoned”, or “Pending”. To do that, we’ll use the Pandas “apply” function. If the case has a patent number, we’ll categorize it as “Patented.” This is better than looking at status, because the case might have lapsed for failure to pay maintenace fees. Then we’ll look at the status and see if the word “abandoned” appears in it. If so, we can be reasonably sure the case has been abandoned. If it doesn’t fit in either category, then the case is still pending.

[7]:
def final_disposition(row):
    row['year'] = row['app_filing_date'].year
    if row['patent_number']:
        row['final'] = 'Patented'
    elif 'abandon' in str(row['app_status']).lower():
        row['final'] = 'Abandoned'
    else:
        row['final'] = 'Pending'
    return row

final_df = exam_df[~exam_df['appl_id'].str.contains('PCT')].apply(final_disposition, axis=1)

final_df.head()
[7]:
appl_id app_filing_date patent_title patent_number app_status app_status_date year final
0 17077511 2020-10-22 HANDLING OUT-OF-ORDER DATA DURING STREAM PROCE... None Publications -- Issue Fee Payment Verified 2022-08-11 2020 Pending
1 16648169 2020-03-17 PARSING REGULAR EXPRESSIONS WITH SPIKING NEURA... None Non Final Action Mailed 2022-08-10 2020 Pending
2 16932252 2020-07-17 LOGICAL PARTITIONS VIA HEADER-BASED PARTITION ... None Notice of Allowance Mailed -- Application Rece... 2022-08-09 2020 Pending
3 17038351 2020-09-30 SYSTEM AND METHOD FOR BLOOM FILTERS IN LARGE S... None Notice of Allowance Mailed -- Application Rece... 2022-08-08 2020 Pending
4 16538065 2019-08-12 SYSTEMS AND METHODS FOR ANALYZING THE VALIDITY... None Advisory Action Mailed 2022-08-08 2019 Pending

Now we need to transform the Pandas dataframe into something we can plot. What we would like to have is a dataframe that looks like this:

Year

Pending

Patented

Expired

2015

13

12

1

2016

14

15

15

To do that, we’re going to need to do an aggregation by year

[8]:
%matplotlib inline

def patented(series):
    return len(list(filter(lambda x: x == '')))

# Group by year, and by final, then create a count of each
result = final_df.groupby(['year', 'final']).agg({'final': 'count'}).rename(columns={'final': 'count'})
result = result.reset_index() # Convert indexes into columns
result = result.pivot(index='year', columns='final', values='count')

result.head()
[8]:
final Abandoned Patented Pending
year
2003 NaN 1.0 NaN
2004 NaN 3.0 NaN
2005 11.0 20.0 NaN
2006 17.0 28.0 NaN
2007 18.0 21.0 NaN

And now we plot it!

[9]:
result.plot.bar(figsize=(10,5))
[9]:
<AxesSubplot:xlabel='year'>
../_images/examples_4_-_Examiner_Monitor_17_1.png

Average Time to First Action#

Maybe we’d like to know how long the Examiner takes from docketing until a first Office Action. To do that, we’ll use the transaction information stored on each case. For example, from our original app, the transactions look like this:

[10]:
app.transactions.to_pandas().head()
[10]:
code date description
0 BRCE 2022-08-10 Workflow - Request for RCE - Begin
1 EML_NTR 2022-08-10 Email Notification
2 MCTAV 2022-08-10 Mail Advisory Action (PTOL - 303)
3 CTAV 2022-08-05 Advisory Action (PTOL-303)
4 FWDX 2022-08-05 Date Forwarded to Examiner

First, we need to know all the possible transaction codes. To do that, we can take every case in the examiner’s stack, and create a set of code-description tuples

[11]:
codes_df = (exam
            .values("transactions")
            .explode("transactions")
            .unpack("transactions")
            .values("transactions.code", "transactions.description")
            .to_pandas()
            .drop_duplicates()
           )
codes_df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1 entries, 0 to 0
Data columns (total 2 columns):
 #   Column                    Non-Null Count  Dtype
---  ------                    --------------  -----
 0   transactions.code         0 non-null      object
 1   transactions.description  0 non-null      object
dtypes: object(2)
memory usage: 24.0+ bytes
[12]:
codes_df.head()
[12]:
transactions.code transactions.description
0 None None

Obviously there are entirely too many possible codes / descriptions to easily pick through in a notebook, so we’ll export the spreadsheet to excel for review

[13]:
codes_df.to_excel('codes.xlsx')

After review, it looks like all the relevant codes for a first action by the examiner begin one one of three prefixes - MFA, CT, or MC. And the case is initially docketed with a code ‘DOCK’. So let’s use this to calculate a time to first office action for each case. First, we’ll make a dataframe that has every transaction, with the associated application id

[14]:
app_df = pd.DataFrame.from_records({'appl_id': a.appl_id, 'obj': a} for a in examiners_applications if 'PCT' not in a.appl_id)
app_df.head()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Input In [14], in <cell line: 1>()
----> 1 app_df = pd.DataFrame.from_records({'appl_id': a.appl_id, 'obj': a} for a in examiners_applications if 'PCT' not in a.appl_id)
      2 app_df.head()

NameError: name 'examiners_applications' is not defined
[ ]:
transactions_df = (exam
                   .values("appl_id", "transactions")
                   .explode("transactions")
                   .unpack("transactions", prefix=False)
                   .to_pandas()
                  )
transactions_df.head()

Now we’ll write a function that takes an application, and returns the time to first Examiner’s action, measured as days from DOCK to an action by the examiner

[ ]:
import re
from datetime import datetime
import warnings

first_action_re='^(MFA|CT[^SH]|MC)'

def time_to_action(group):
    transactions = group.sort_values("date")
    docketing_date = transactions.query("code == 'DOCK'").iloc[0].date
    try:
        first_action = transactions[transactions.code.str.contains(first_action_re)].iloc[0]
    except IndexError:
        return pd.Series()
    row = pd.Series()
    row['first_action'] = first_action.description
    row['first_action_code'] = first_action.code
    row['first_action_date'] = first_action.date
    row['docketed_date'] = docketing_date

    row['time_to_action'] = (
        first_action.date
        - docketing_date
    ).days
    return row


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result_df = transactions_df.groupby("appl_id").apply(time_to_action).unstack()

result_df.head(20)

And now we can look at the aggregate stats for time to first office action:

[ ]:
result_df.time_to_action.describe()

We can also look at the average time by what the response type is:

[ ]:
result_df.groupby('first_action').describe()