| --- |
| license: mit |
| datasets: |
| - bigbio/chemdner |
| - ncbi_disease |
| - jnlpba |
| - bigbio/n2c2_2018_track2 |
| - bigbio/bc5cdr |
| language: |
| - en |
| metrics: |
| - precision |
| - recall |
| - f1 |
| pipeline_tag: token-classification |
| tags: |
| - token-classification |
| - biology |
| - medical |
| - zero-shot |
| - few-shot |
| library_name: transformers |
| --- |
| # Zero- and few-shot NER for biomedical texts |
|
|
| ## Model description |
| The model takes two strings as input. String1 is the NER label and must be a word or phrase that names the entity class. String2 is a short text in which String1 is searched for semantically. |
| The model outputs a list of zeros and ones, one value per token of String2 (tokens as produced by the transformer tokenizer), where the ones correspond to occurrences of the named entity. |
|
|
| ## Example of usage |
| ``` |
| from transformers import AutoTokenizer |
| from transformers import BertForTokenClassification |
| |
| modelname = 'ProdicusII/ZeroShotBioNER'  # model path |
| tokenizer = AutoTokenizer.from_pretrained(modelname)  # load the tokenizer of that model |
| string1 = 'Drug'  # NER label to search for |
| string2 = 'No recent antibiotics or other nephrotoxins, and no symptoms of UTI with benign UA.' |
| # Encode the label and the text together as a sentence pair |
| encodings = tokenizer(string1, string2, is_split_into_words=False, |
|                       padding=True, truncation=True, add_special_tokens=True, return_offsets_mapping=False, |
|                       max_length=512, return_tensors='pt') |
| |
| model = BertForTokenClassification.from_pretrained(modelname, num_labels=2) |
| outputs = model(**encodings) |
| prediction_logits = outputs.logits  # shape: (batch_size, sequence_length, num_labels) |
| print(prediction_logits) |
| ``` |
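
The example above prints raw logits, while the model description speaks of zeros and ones per token of String2. The following is a minimal sketch of one way to bridge the two, not the authors' own post-processing. It assumes that label index 1 means "entity", that tokens of String2 carry segment id 1 (the usual BERT `token_type_ids` convention for sentence pairs), and that subwords use the WordPiece `##` prefix. The helper names `label_text_tokens` and `merge_wordpieces` are hypothetical, and the token list and logits are made-up toy values, not real model output.

```python
def label_text_tokens(tokens, segment_ids, logits,
                      special_tokens=("[CLS]", "[SEP]", "[PAD]")):
    """Return (token, label) pairs for the tokens of String2 only.

    Assumption: segment id 1 marks String2 and label index 1 means "entity".
    """
    labelled = []
    for token, segment, scores in zip(tokens, segment_ids, logits):
        # Skip the label segment (String1) and all special tokens.
        if segment != 1 or token in special_tokens:
            continue
        # Argmax over the two class scores gives the 0/1 label.
        label = max(range(len(scores)), key=scores.__getitem__)
        labelled.append((token, label))
    return labelled


def merge_wordpieces(labelled):
    """Join '##' subword pieces; a word is an entity if any piece is."""
    words = []
    for token, label in labelled:
        if token.startswith("##") and words:
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + token[2:], max(prev_label, label))
        else:
            words.append((token, label))
    return words


# Toy inputs for illustration only (NOT produced by the model).
tokens = ["[CLS]", "drug", "[SEP]", "no", "anti", "##biotics", "given", "[SEP]"]
segment_ids = [0, 0, 0, 1, 1, 1, 1, 1]
logits = [[2.0, -1.0], [1.5, 0.1], [2.0, -1.0], [3.0, -2.0],
          [-0.5, 1.2], [0.3, 0.2], [2.5, -1.5], [2.0, -1.0]]

labelled = label_text_tokens(tokens, segment_ids, logits)
words = merge_wordpieces(labelled)
print(labelled)
print(words)
```

With the objects from the usage example, the inputs could be obtained as `tokenizer.convert_ids_to_tokens(encodings['input_ids'][0].tolist())`, `encodings['token_type_ids'][0].tolist()`, and `model(**encodings).logits[0].detach().tolist()`, assuming the tokenizer returns `token_type_ids` for sentence pairs. Marking a word as an entity when any of its pieces is labelled 1 is a design choice of this sketch; other aggregation rules, such as using only the first piece, are equally possible.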
|
|
| ## Code availability |
|
|
| Code used for training and testing the model is available at https://github.com/br-ai-ns-institute/Zero-ShotNER |
|
|
| ## Citation |