This article was published as a part of the Data Science Blogathon. Introduction: Data augmentation (DA) is a technique for artificially increasing the size of a training set by generating modified versions of real data, without collecting any new data. The data must be changed in a way that preserves the class labels, for better performance in […]
This article was published as a part of the Data Science Blogathon. Introduction: In the first part of this series, we looked at some of the most common techniques used day to day for cleaning text data in NLP. If you haven't read it yet, I recommend reading it first, as it will help you in […]
Feature Engineering on Text Data Using Natural Language Processing Techniques. This article focuses primarily on feature engineering for text data. Along the way, we will go over the following techniques and processes: lemmatization / stemming, count vectorizer, one-hot encoding, train-test split, principal component analysis, some general text cleaning and null value imputation […]
This article was published as a part of the Data Science Blogathon. Introduction: In any machine learning or data analysis task, the first step is to clean and process the data. Cleaning is important for model building. How the data is cleaned depends on its type, and if the data is textual […]