MEFASH-WORLD is an acronym for MEN FASHION GENERATING SYSTEM. The whole concept of MEFASH-WORLD was born out of the need to solve a problem in the fashion industry. Am going to be walking you through the whole process of creating a system that can generate new efficient designs, yes, fashion designs for men.
Before we start building this system there are some basic concepts we need to know and the field of DATA SCIENCE AND ANALYTICS is a field that is vital in this development process.
Data Science is a multi-disciplinary field that brings together or combine concepts from three(3) major fields: Computer Science, Statistics/Machine Learning, and Data Analysis to understand and extract insights from the ever-increasing amount of data. One very interesting thing is that data is in abundance today and we have moved into the era of BIG DATA. Everyday we accumulate thousands of data, for instance, everyday you generate data, yes you, are you surprised how?. Ok, from the time you wake, take your bath, eat, the food you eat, the time you leave for work/school, all the activities you carry out within the day, these are all data ππ€.
PARADIGMS OF DATA RESEARCH:
There are actually two paradigms of data research, namely;
But for our MEFASH GENERATING SYSTEM, we will be making use of the Hypothesis-Driven approach and this is because we have identified a problem and we are to source for the dataset to be used in building our system. And the dataset is going to be images.
Which paradigm of Data Research do you think is the best? And why?
Leave your answers for me in the comment session below ππ.Bye π₯°
Hello guys, welcome back to another session on the how to create a FASHION
GENERATING SYSTEM. A quick
flashback to my first post (INTRODUCTION TO MEFASH-WORLD:
Here ),I said we
will be creating a system that can generate new efficient designs, yes, fashion designs for men. And
yesterday I stated that we will be using the HYPOTHESIS-DRIVEN paradigm of Data Research.
We will also be making use of Machine Learning algorithms too but before that we need to know the two main
types of machine learning algorithm, they are four actually.
But we are going to be making use of the UNSUPERVISED LEARNING approach for MEFASH
GENERATING SYSTEM.
We will be stopping here for today, I hope you gained a thing or two from this session.
You can leave your questions about anything you are not clear about or comment on in the comment
section
below π. Byeπ₯°
Hello guys π€ π, I warmly welcome you to another session on how to build a FASHION GENERATING SYSTEM, in this case, MEFASH GENERATING SYSTEM π. Sit tight as today on this episode we will be looking at how to identify our data, that is, the kind of data that will make up our dataset and how to prepare themπ.
In order for you to Identify the data to be used, you need to first identify the problem, in this case, our problem is how to create new fashion styles and we will be using machine learning to solve this problem. So if we are solving the problem of how to use machine learning to come up with a model that can generate new styles for customers, what do you think our dataset will look like? Now there are five types of data, that is, five forms in which our data can appear, namely:
But for our MEFASH GENERATING SYSTEM, we will be making use of the Hypothesis-Driven
approach and this is
because we have identified a problem and we are to source for the dataset to be used in building our
system. And the dataset is going to be images.To answer the question I asked earlier, we are going to be
making use of the UNSTRUCTURED DATA. This is because our dataset is going to be made up of images of
previous fashion styles which we will use to feed our model and train the model. This is getting
interesting
ππ€.
OK, that will be all for today guys, see you tomorrow for the next session ππ.
Which paradigm of Data Research do you think is the best? And why?
You can leave your questions about anything you are not clear about or comment on in the comment
section
below π. Byeπ₯°
Hello people, hope you are having a great day π?. We will be making progress on our
way to building our
MEFASH GENERATING SYSTEM. Since we already know we will be working with images as our dataset, we will be
talking about Data Preparation today.
The fact is, machine learning models depend on data, but before bringing your data into your machine
learning model it is important to make sure it is clean, accurate, and consistent. To create a machine
learning model, it is crucial that you are able to Train, Test and Validate the data before production.
Hence data preparation is used to produce Clean, Accurate, and Consistent data (CAC) that can be fed to your
machine learning model.
Once our data is identified and collected, data preparation is carried out to assess the condition of the
data which includes dropping incomplete records, outlier detection, random value filling, heuristic guess,
interpolation. But these techniques are mostly used for structured data, but since we are dealing with
unstructured data we will be looking at how to classify our data into Training data, Testing data, and
Validation Data.
In machine learning, training data is the data you feed to your machine learning algorithm for the purpose of training your model. Since we are using the supervised learning approach our training data is going to require human involvement to analyze and process for the computer to use, which means we are going to have to label the data. Training data is usually 80-90% of the dataset.
The test data is usually 10-20% of the dataset. It is a set of observations used to evaluate the effectiveness or performance of the model. The testing data are examples that the machine has not seen before, if it contains data that the machine has seen before or training data, it will be difficult to know or assess whether the machine has learned and this could lead to OVER-FITTING (which means memorizing the training data).
Validation data is required in some cases. It is used to tune variables called HYPERPARAMETERS which are responsible for controlling how the model is learned.
This is where we call it a wrap for today. I will discuss how to collect data(images)
in a very easy way
subsiquentlyππ.
You can leave your questions about anything you are not clear about or comment on in the comment section
below π. Byeπ₯°
There are several crawlers and frameworks for scrapping data on the World Wide
Web(www), but today we are
going to focus on ICRAWLER and we will be making use of the python programming language.
Good day and welcome to another exciting session of MEFASH GENERATING SYSTEM ππ. Today I will be teaching
you how to crawl or scrape data from the internet using ICRAWLER and Google Image Crawler. Don't worry, it's
very easy and I will be walking you through every stepπ.
A crawler is a computer program that automatically searches documents on the web. A web crawler is an internet bot that systematically browses the World Wide Web (www), typically operated by search engines like Google, Bing for the purpose of web indexing.
For you to be able to use ICRAWLER you will need the following on your computer system:
Install ICRAWLER You will need to open your Anaconda prompt and type the following pip install ICRAWLER or conda install -c hellock icrawler. Then you type and run the following lines of code on your jupyter notebook :
NOTE: In line 2, keyword is the type of data you want ICRAWLER to scrap for you. In line 6, storage is where you put the path to the folder where you want ICRAWLER to store the dataset for you.
Check Documentation for more
information on how to
use ICRAWLER.
OK, we stop here for today guys π€ π.
You can leave your questions about anything you are not clear about or comment on in the comment section
below π. Byeπ₯°
Data Scientists Network (DSN) is a registered non-profit oranization with a vision to build a world-class artifitial intelligence knowledge, research and innovation ecosystem that delivers high impact and transformational research, business use applications, AI-first start-ups, employability and social good use cases; such that in 10 years, they will raise 1 million AI tallents and build AI solutions that improves the quality of life and wellbeing of 2 billion people in emergiing market. This vision is avhieved through renewed focus on 3 core areas;
I partnered with DSN as a facilitator in the 2022 AI Invasion, an annual 6 weeks free AI class and training organized by DSN. This training features introductory Python, Data Science and Machine Learning.
Leave a comment