From Data Collection to Model Deployment: A Step-by-Step Guide in Machine Learning
1) Collect Data => To train a model we need data, so we collect it from different sources such as Excel files, SQL databases, web scraping, and many others. There are also websites such as Kaggle or Stats from which we can download datasets. Most of these datasets are available as CSV (Comma Separated Values) files, a plain-text tabular format that opens in Excel and is one of the most commonly used formats for ML work.

2) Data Cleaning => Whenever we get data, it arrives in raw form, which is not directly usable for ML. Some values are missing or null, some are in the wrong format, some rows are duplicated, and much more. This type of data is called messy data. If we use it directly in ML models we may get wrong or biased predictions, confusing insights, or low accuracy, so it is extremely important that we do not feed it to a model as-is. First of all the data should be cleaned. For this we have many tools, like pandas and sklearn in Python, cloud options, and many others, but in the initial ML journey pandas...
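To make the collect and clean steps concrete, here is a minimal sketch using pandas. The tiny dataset, its column names (name, age, city), and the clean helper are all made up for illustration, and dropping bad rows is only one possible cleaning policy, not the only correct one.

```python
import io

import pandas as pd

# Hypothetical raw data standing in for a downloaded CSV file.
# It contains a missing value, a wrongly formatted value, and a duplicate row.
RAW_CSV = """name,age,city
Asha,29,Pune
Ravi,,Delhi
Asha,29,Pune
Meena,abc,Mumbai
John,41,Chennai
"""


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no duplicate rows, numeric age, no missing age."""
    # Remove exact duplicate rows.
    df = df.drop_duplicates()
    # Convert age to a number; values that cannot be parsed become NaN.
    df = df.assign(age=pd.to_numeric(df["age"], errors="coerce"))
    # Drop rows where age is missing or was in the wrong format.
    df = df.dropna(subset=["age"])
    return df.reset_index(drop=True)


# Step 1: collect. With a real file this would be pd.read_csv("your_file.csv").
raw = pd.read_csv(io.StringIO(RAW_CSV))

# Step 2: clean.
cleaned = clean(raw)

print(raw.isna().sum())  # missing values per column in the raw data
print(cleaned)
```

Depending on the dataset, filling missing values (for example with fillna) may be a better choice than dropping rows, since dropping throws information away.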