Data Science & ML
Predictive modeling, feature engineering, classification, regression, and performance evaluation.
scikit-learn • XGBoost • PyTorch
Open for Full Stack Data Analytics, Data Science, Applied ML-NLP & LLMs roles
I build machine learning, NLP, generative AI, and full-stack data analytics systems that turn complex data into actionable insight for healthcare, finance, language technology, and decision-making.
// 01_about
I am a Full Stack Data Analyst and ML + NLP Specialist focused on the full data and AI pipeline, from raw data extraction to analytics delivery, machine learning modeling, and NLP research. My strength is connecting data engineering, business intelligence, machine learning, and language AI into practical systems that support decision-making and real-world applications. My work combines model development, data pipelines, dashboards, and stakeholder-facing insight delivery.
I have worked across research, healthcare and finance analytics, data engineering, and Business Intelligence (BI) development and reporting. I enjoy building systems that connect raw data, machine learning models, and clear decision-making outputs.
profile.json
{
"name": "Victor Owino",
"role": "Data Scientist & Full Stack Data Analyst",
"location": ["New York City, NY", "Kirkland, WA"],
"focus": [
"NLP and LLMs", "Generative AI", "Healthcare Data Science",
"Machine Learning", "Full Stack Data Analytics"
],
"tools": ["Python", "SQL", "PyTorch", "Power BI", "Tableau", "Hugging Face Transformers (open source)"],
"availability": "Open to full-time/part-time roles"
}
// 02_focus
Predictive modeling, feature engineering, classification, regression, and performance evaluation.
scikit-learn • XGBoost • PyTorch
Text classification, NER, corpus annotation, multilingual analysis, and language technology evaluation.
BERT • spaCy • Hugging Face
Prompt engineering, structured reasoning, LLM benchmarking, and semantic similarity scoring.
LLMs • PIC • SBERT
Executive dashboards, KPI reporting, data storytelling, and operational decision support.
Power BI • Tableau • SQL
Clinical NLP, adverse event detection, EHR modeling, and healthcare analytics workflows.
ClinicalBERT • MIMIC-IV • EHR
Data pipelines, model workflows, API-driven prototypes, dashboards, and deployment-ready interfaces.
Python • REST APIs • Git
// 03_research
Clinical adaptation of structured prompting to help LLMs identify medication-harm links in adverse drug event detection from clinical notes.
$ run_experiment --task ade_detection --method PIC
status: conference submission
focus: structured reasoning + biomedical NLP
Deep learning approaches for predicting blood transfusion adverse events using patient-record data and healthcare modeling workflows.
$ train_model --domain healthcare --risk adverse_events
output: patient-level prediction workflow
Swahili annotation and cross-lingual benchmark development for evaluating LLM understanding of euphemisms, pragmatic meaning, and translation quality.
$ evaluate --language swahili --task euphemism_translation
metric: exact match + semantic similarity
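The two metrics named for the euphemism benchmark can be illustrated with a minimal sketch. It is not the project's evaluation code: the function names, the normalization rules, and the `embed` callable (any function that maps a string to a vector, for example a sentence-embedding model) are assumptions made for illustration.

```python
import math
import unicodedata


def normalize(text: str) -> str:
    """Apply NFKC normalization, lowercase, and collapse whitespace (assumed rules)."""
    return " ".join(unicodedata.normalize("NFKC", text).lower().split())


def exact_match(prediction: str, reference: str) -> bool:
    """True when the two strings are identical after normalization."""
    return normalize(prediction) == normalize(reference)


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score(prediction: str, reference: str, embed) -> dict:
    """Report both metrics for one prediction/reference pair.

    `embed` is a hypothetical callable supplied by the caller that
    returns a sequence of floats for a given string.
    """
    return {
        "exact_match": exact_match(prediction, reference),
        "semantic_similarity": cosine_similarity(embed(prediction), embed(reference)),
    }
```

Reporting the two numbers side by side is one reasonable design choice: exact match is strict and easy to audit, while the similarity score gives partial credit to paraphrases that keep the meaning.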
// 04_projects
Fine-tuned BERT, BioBERT, and ClinicalBERT models to extract Conditions, Drugs, and Procedures from clinical text using BIO tagging.
Built analytical datasets and engineered financial risk features for loan default prediction and portfolio monitoring.
Cleaned, transformed, and enriched claims data to monitor suspicious patterns in procedures, diagnoses, and billing amounts.
Designed structured prompting workflows for adverse drug event classification, comparing PIC strategies with zero-shot and few-shot baselines.
Built evaluation workflows for testing whether LLMs preserve implicit and euphemistic meaning across languages and contexts.
Supported public health research by cleaning data, engineering features, creating analysis-ready datasets, and co-developing statistical/ML models.
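For the clinical entity extraction project listed above, BIO tagging marks each token as the Beginning of an entity, Inside one, or Outside any entity. The sketch below shows one way to turn a tagged sequence back into entity spans. It is illustrative only: the function name, the tag names `B-Drug` and `B-Condition`, and the example sentence are assumptions, not taken from the project.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    spans = []
    current_label, current_tokens = None, []

    def flush():
        # Close the entity currently being built, if there is one.
        nonlocal current_label, current_tokens
        if current_label is not None:
            spans.append((current_label, " ".join(current_tokens)))
        current_label, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag with the same label extends the open entity.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity,
            # is treated as outside (a simplifying assumption).
            flush()
    flush()
    return spans


# Made-up example sentence with hypothetical tag names.
tokens = ["Patient", "started", "metformin", "for", "type", "2", "diabetes"]
tags = ["O", "O", "B-Drug", "O", "B-Condition", "I-Condition", "I-Condition"]
print(bio_to_spans(tokens, tags))
# [('Drug', 'metformin'), ('Condition', 'type 2 diabetes')]
```

Dropping stray `I-` tags is only one convention; some evaluation setups instead treat a stray `I-` tag as the start of a new entity.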
// 05_experience
09/2025 — 05/2026
09/2023 — 12/2024
10/2021 — 08/2023
01/2021 — 09/2021
// 06_skills
Python · R · SQL · Bash
scikit-learn · XGBoost · TensorFlow · PyTorch · feature engineering · classification · regression
Hugging Face · BERT · BioBERT · ClinicalBERT · spaCy · NLTK · prompt engineering · LLM evaluation
ETL/ELT · workflow orchestration · data lakes · data warehousing · REST APIs · JSON · CSV · Parquet
PostgreSQL · MySQL · Google BigQuery · Neo4j · AWS · Azure · GCP
Power BI · Tableau · Excel · dashboards · KPI reporting · statistical analysis
// 07_education_certifications
Expected 05/2026
Relevant coursework: Machine Learning, Deep Learning, Natural Language Processing, Special Topics in Generative AI.
2021
Relevant coursework: Database Management Systems, Object-Oriented Programming, Data Structures and Algorithms.
Certifications
// 08_contact
Interested in discussing a data science, NLP, analytics, machine learning, or applied AI role? Let’s connect.
$ contact --email
victorowinoke@gmail.com
$ location
New York City, NY • Kirkland, WA