About me

I'm a Data Scientist and Machine Learning Engineer from Cleveland, Ohio, working at Cleveland Clinic and the Veterans Affairs Hospital as a Data Scientist and Signal Processing Consultant. I enjoy turning complex problems into simple, beautiful, and intuitive visual graphs.

As a data scientist and machine learning engineer, I am passionate about using data to solve complex problems and create innovative solutions. With a strong foundation in statistics, mathematics, and computer science, I have experience in developing and implementing machine learning algorithms, statistical models, and predictive analytics to extract insights and make data-driven decisions.

I am skilled in programming languages such as Python, R, and SQL, and have experience with popular machine learning frameworks such as TensorFlow and Scikit-learn. Additionally, I have experience working with big data technologies such as Hadoop and Spark, as well as experience in data visualization tools such as Tableau and Power BI. With a strong analytical mindset, attention to detail, and a dedication to continuous learning, I am committed to delivering high-quality solutions that drive business impact.

What I'm doing

  • Algorithms and data structures

  • Data visualization

  • Cloud computing

  • Quantitative analysis

  • Statistical modeling

  • Full Stack Development

  • Data Science Mentor

  • Research Scientist

Testimonials

  • Daniel Lewis

    Varun Reddy Gopu is a very good data scientist.

  • Jessica Miller

    Varun is a great machine learning engineer.

  • Emily Evans

    Varun is great at AI and ML and did a lot of projects for the healthcare sector.

  • Henry William

    Varun is a skillful data scientist.

Resume

Education

  1. Case Western Reserve University

    2022 — 2023

    As a highly motivated graduate student pursuing a Master's degree in Biomedical Engineering with a specialization in Data Science, I have developed a strong foundation in a diverse range of subjects, preparing me for a career at the intersection of engineering, medicine, and data analytics.

    My academic coursework has included essential subjects such as Introduction to Biomedical Engineering, Biostatistics, and Data Structures and Algorithms, providing me with a robust understanding of the fundamentals of biomedical engineering and data analysis.

    Furthermore, I have pursued advanced coursework in Signals and Systems, Machine Learning, Medical Imaging, Data Visualization and Communication, and Deep Learning, equipping me with cutting-edge techniques and tools for analyzing and interpreting complex biomedical data.

    Through my academic experience, I have gained hands-on experience working with various programming languages, software tools, and techniques used in the field, and have completed a research project focused on the development of a data-driven approach for improving medical diagnosis.

    Overall, my Master's degree in Biomedical Engineering with a specialization in Data Science has equipped me with the knowledge, skills, and experience to excel in a variety of roles in the biomedical industry and beyond.

  2. Kamala Institute of Technology and Sciences

    2011 — 2015

    Proficient in programming languages such as C, C++, Python, and MATLAB for data analysis, simulation, and control systems design

    Skilled in using electronic design automation (EDA) software such as Altium Designer and Cadence to design and simulate electronic circuits and systems

    Experienced in using various electronic measuring instruments such as oscilloscopes, spectrum analyzers, and logic analyzers for circuit testing and troubleshooting

    Knowledgeable in telecommunications systems, including signal processing, modulation, and wireless communication protocols such as LTE, Wi-Fi, and Bluetooth

    Strong understanding of digital signal processing (DSP) techniques such as filtering, transformation, and modulation for data compression, image processing, and speech analysis

    Familiarity with microcontroller-based systems design and programming using popular microcontrollers such as Arduino and Raspberry Pi

    Proficient in technical writing and documentation, including creating technical reports, diagrams, and user manuals

    Strong analytical and problem-solving skills with attention to detail, able to troubleshoot complex problems, and propose innovative solutions

    Ability to work in a team environment, effectively communicate ideas, and collaborate with colleagues to meet project goals and timelines.

  3. Narayana Junior College

    2009 — 2011

    Did specialization in

  4. Siddartha High School

    2004 — 2009

Experience

  1. Data Scientist at Cleveland Clinic

    2023 — Present

    Developed, enhanced, and maintained a Gradient Boosting Classifier to predict amblyopia in paediatric patients with an accuracy of 80%, and built the data pipeline using Joblib.

    Applied regularization, hyperparameter tuning, and additional features to address overfitting and underfitting.

    Designed and developed a user interface (UI) to analyze data and generate reports through an API endpoint that accepts input data and returns the model's predictions. This supports deploying the model to production and integrating it into an application or website.

  2. Data Scientist at Veterans Affairs/University Hospital

    2023 — Present

    Using machine learning tools such as decision trees, logistic regression, and neural networks, I collaborated with a doctor to develop a predictive model for the BARDA Challenge aimed at identifying the risk of hospitalization in patients with Oculopalatal tremor explained by a model of inferior olivary hypertrophy and cerebellar plasticity.

    Collaborated with biostatisticians using multiple linear regression and linear mixed effects models in analyzing in-vivo animal data, utilizing tools such as R and SAS, to identify a promising insulin analog for clinical trials.

  3. R&D Data Scientist at Case Western Reserve University/Case School of Medicine

    2022-2023

    Developed an NLP pipeline using tools like spaCy and scikit-learn to perform entity and relation extraction at scale on PubMed and prostate cancer datasets, later used to automate various processes within the MDR and Marketing teams.

    Utilized Python with tools such as scikit-learn, lifelines, and pandas to design and implement statistical learning, time-to-event, and predictive survival models on real-world evidence (RWE) data sourced from 17M patients’ records.

    Developed a gene-based recommendation model for prostate cancer patients using machine learning techniques and gene analysis, providing treatment insights, with tools including Python and relevant libraries.

  4. Data Scientist at Capital Now

    2022-2023

    Developed a Gradient Boosting Classifier ML model to accurately classify defaulters from a pool of 200,000 customers, identifying 5,000 defaulters with high precision, and designed and implemented a recommendation system to predict future defaulters.

    Built pipelines on Azure Databricks in Python for predicting productivity: data preprocessing using sklearn, feature-engine, and featuretools; feature selection/reduction using SelectFromModel and PCA; regression modeling using GBM, LightGBM, XGBoost, Random Forest, CatBoost, Ridge, Lasso, and Elastic Net; and hyperparameter tuning using GridSearchCV and RandomizedSearchCV.

    Developed an AI model for customer identity verification using Natural Language Processing (NLP) libraries like spaCy and NLTK, Computer Vision libraries like OpenCV and TensorFlow, and machine learning frameworks like Scikit-learn and Keras. The model includes text segmentation and extraction techniques customized for ID cards, driving licenses, and passports from over 100 countries. Reduced the data scraping pipeline time from one year to less than one month using optimization techniques such as Gradient Descent and multiprocessing libraries like Multiprocessing and Dask.

    Utilized Computer Vision libraries such as OpenCV and TensorFlow to employ multi-instance segmentation methods for detecting spoofed and tampered documents.

    Built 9 Tableau dashboards with 74 views to analyze the Annuities business and deployed them to production; the dashboards helped restart policies worth $84 million.

    Utilized libraries such as Scikit-learn, TensorFlow, and PyTorch to conduct advanced analytics, leveraging machine learning algorithms, predictive modeling techniques, and optimization methods to deliver insights into credit risk and develop analytical solutions that help to achieve business objectives.

    Owned and managed dynamic Reject Inference (RI)/Loan Performance data tables (100 million+ rows) from various credit bureaus; invented and implemented data quality checking, data processing, and updating pipelines using Python libraries such as Pandas, NumPy, and Dask, resulting in more efficient and accurate data analysis.

    Developed a deep learning model using face recognition libraries like OpenFace and dlib, which replaced the existing API for face verification of customers and resulted in a cost reduction of 15% for the company.

    Optimized the existing model using libraries such as TensorFlow and PyTorch, which led to a 25% reduction in the company's core production pipeline time.

  5. Data Scientist/Machine Learning Engineer at Stratycon Technologies Pvt Ltd

    2015-2022

    Managed the Data Science team, designed the product pipeline, and transformed VoIP services through AI to reduce the need for human intervention, providing benefits to businesses with large call volumes.

    Performed in-depth analyses of structured and unstructured data using advanced statistical techniques and strong knowledge of algorithms, resulting in the development of data reports, performance metrics, and strategic objectives to address business partner objectives in CVS.

    Performed exploratory data analysis using pandas-profiling and K-Means clustering on NCES data for new business prospects to expand Annuities business.

    Utilized AI/ML and cloud capabilities to build innovative solutions for complex business problems, resulting in improved analytical scenarios and potential future outcomes for 1-Pharma company.

    Developed an algorithm using OpenCV, YOLO and TensorFlow for license plate detection which yielded an accuracy of 90%.

    Designed a system for 70+ branches of Domino's in the USA, including an algorithm to mask customers' credit card information in call recordings, a sentiment analysis system, an algorithm to detect up-selling, and a pipeline that allows AI-based systems to accept voice or text instructions and perform tasks accordingly.

    Performed data wrangling, manipulation and visualization on 3 million rows of healthcare data for descriptive statistical analysis with R, Python, Tableau and Excel.

    Built an AI/ML-enabled cybersecurity product that uses a hybrid neural network (a combination of supervised and unsupervised learning) to detect, handle, and remediate ransomware attacks, and designed a machine learning pipeline to perform database interactions, data pre-processing, and model behaviour tracking.

My skills

  • Statistical Modeling
    95%
  • Data Visualization: Seaborn, Plotly, Tableau, Matplotlib, Power BI, QlikView, Communication Skills, Looker, Alteryx, Google Analytics
    95%
  • Data Wrangling
    95%
  • Machine Learning Algorithms
    95%
  • NLP, Computer Vision, and Image Segmentation
    95%
  • Big Data Frameworks and Cloud Services
    95%
  • Programming languages: Python, R, Java, Scala, SQL, C++, C
    95%
  • Database Management and SQL
    95%
  • Deep Learning Frameworks: TensorFlow, Keras, PyTorch, Caffe, MXNet, Convolutional Neural Networks, Long Short-Term Memory
    95%
  • Data Engineering Tools: Apache Airflow, Luigi, AWS Glue, Google Cloud Dataflow
    95%
  • Web Frameworks and Database/ORM Tools: Node.js, HTML, CSS, JS, Next.js, MongoDB, Flask, Django, FastAPI
    95%
  • Cloud Services: AWS, Azure, Snowflake
    90%
  • Data Warehousing: Snowflake, Redshift, Teradata
    80%
  • MLOps: Model versioning, Model training & experimentation, Model deployment, Model validation, Model monitoring, Model retraining, Model registry, Feature engineering, Model serving, Model governance, ML pipeline automation
    80%
