Unit 2: AI Project Cycle
Question and Answer
Question 1: Explain the stages of AI project cycle.
Answer: The stages of AI project cycle are:
Stage 1: Planning
• Problem scoping
Problem scoping is the stage that begins once a problem has been identified for an AI project. This is where we define the project and set the goals that we want it to achieve. Problem scoping can be considered a part of the problem-definition phase. An AI project begins with defining the problem, followed by brainstorming, designing, building and testing, and concludes with sharing or showcasing the work. The 4Ws canvas helps in identifying the problem. The 4Ws are Who? What? Where? Why? The Who canvas tells us who the stakeholders are, i.e., who is affected by the problem. The What canvas gives us information about the nature of the problem. The Where canvas tells us where the problem arises. The Why canvas tells us the benefits that the stakeholders would get from the solution and how it will benefit them as well as society.
Stage 2: Data Acquisition
This stage is about acquiring data for the project. We acquire data from a variety of sources such as APIs (application programming interfaces), surveys and web scraping. The data that we collect in this stage is of different types. They are given as follows:
After collecting data, the next vital step is curation, or filtering, of the data. Using unfiltered and irrelevant data will lead to meaningless and futile analysis.
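As a minimal sketch of curation, assume that survey responses arrive as a list of dictionaries. The field names `age` and `city` and the validity rule below are hypothetical, chosen only for illustration. Incomplete or implausible records are filtered out before any analysis:

```python
# Hypothetical survey records; field names and values are illustrative only.
raw_records = [
    {"age": 15, "city": "Pune"},
    {"age": None, "city": "Delhi"},   # missing value
    {"age": 230, "city": "Mumbai"},   # implausible value
    {"age": 16, "city": ""},          # empty field
    {"age": 17, "city": "Chennai"},
]

def is_valid(record):
    """Keep a record only if every field is present and the age is plausible."""
    age = record.get("age")
    city = record.get("city")
    return isinstance(age, int) and 0 < age < 120 and bool(city)

curated = [r for r in raw_records if is_valid(r)]
print(len(raw_records), "records collected,", len(curated), "kept after curation")
```

What counts as "irrelevant" depends on the project, so the rule inside `is_valid` would change from one problem to another.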
Stage 3: Data exploration
After acquisition and curation of data, the next stage is data exploration. Data exploration refers to techniques and tools that are used for identifying important patterns and trends. We can do it either through data visualisation or by adopting sophisticated statistical methods.
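A small sketch of data exploration, assuming a hypothetical list of daily temperatures (the values are made up for illustration). It uses Python's standard `statistics` module for simple statistical summaries and a text bar chart as a very basic form of visualisation:

```python
import statistics

# Hypothetical daily temperatures (in degrees Celsius); values are illustrative only.
temperatures = [30, 32, 31, 35, 36, 38, 37]

mean_temp = statistics.mean(temperatures)
median_temp = statistics.median(temperatures)
spread = max(temperatures) - min(temperatures)

# A very simple trend check: compare the average of the first and second halves.
half = len(temperatures) // 2
first_half_mean = statistics.mean(temperatures[:half])
second_half_mean = statistics.mean(temperatures[half:])
trend = "rising" if second_half_mean > first_half_mean else "not rising"

print("mean:", mean_temp, "median:", median_temp, "range:", spread, "trend:", trend)

# A text bar chart: one row per day, longer bars for warmer days.
for day, t in enumerate(temperatures, start=1):
    print(f"day {day}: {'#' * (t - 25)} {t}")
```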
Stage 4: Modelling
Based on the trends and patterns found during data exploration, the next important step is to use the data for making predictions or forecasts. Graphical representation makes the data understandable for humans, as we can discover trends and patterns in it. A machine, however, needs the data in its most basic numeric form (binary 0s and 1s) in order to access and analyse it, and it relies on mathematical representations to discover patterns and trends. The ability to mathematically describe the relationship between parameters is the heart of every AI model. Thus, whenever we talk about developing AI models, we are referring to the mathematical approach towards analysing data. Generally, AI models can be classified as follows:
Stage 5: Evaluation
The last stage of an AI project cycle is evaluation and deployment. Once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model. Hence, the model is tested with the help of testing data, and its efficiency is calculated on the basis of the following parameters:
• Accuracy
• Precision
• Recall
• F1 Score
The evaluation stage helps in:
• enhancing the research field
• identifying the fields that can be replicated
• identifying the fields with shortcomings, which can become the fields for further research
• collecting the knowledge for accumulation
Routine evaluation is carried out after each stage of the project cycle, while the final evaluation takes place after the conclusion of the project. Routine evaluation makes sure that modifications are made immediately. Carrying out only a single evaluation after the project might result in failures and losses in terms of money and time. Deployment is the process of using the project for the purpose that was envisaged earlier. This purpose can be selling it to the end-user or using it to solve problems.
Question 2: Write short notes on:
a) Problem scoping
Answer: Problem scoping is the first stage in the AI project cycle and begins once the problem has been identified. This is where we define the project and set the goals that we want our AI project to achieve. Problem scoping can be considered a part of the problem-definition phase. The 4Ws canvas can be used for identifying the problem in the problem scoping stage. The 4Ws stand for Who? What? Where? Why? The Who canvas tells us who the stakeholders are, i.e., who is affected by the problem. The What canvas gives us information about the nature of the problem. The Where canvas gives us information about where the problem arises. The Why canvas tells us the benefits that the stakeholders would get from the solution and how it will benefit them as well as society.
b) Data acquisition
Answer: Data acquisition is the second stage of the AI project cycle. It deals with acquiring data for the project from a variety of sources such as APIs (application programming interfaces) and surveys. The data that is collected in this stage is of different types. They are:
c) Data Exploration
Answer: This is the third stage of AI project cycle. It refers to the techniques and tools that are used for identifying important patterns and trends. We can do it by data visualisation or by adopting sophisticated statistical methods.
d) Modelling
Answer: This is the fourth stage of the AI project cycle. Based on the trends and patterns found during data exploration, the next important step is to use the data for making predictions or forecasts. Graphical representation makes the data understandable for humans, as we can discover trends and patterns in it. A machine, however, needs the data in its most basic numeric form (binary 0s and 1s) in order to access and analyse it, and when it comes to discovering patterns and trends in the data, it relies on mathematical representations. The ability to mathematically describe the relationship between parameters is the heart of every AI model. Thus, whenever we talk about developing AI models, we are referring to the mathematical approach towards analysing data. Generally, AI models can be classified as:
e) Evaluation
Answer: Evaluation is the last stage of the AI project cycle. Once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model. Hence, the model is tested with the help of testing data, and its efficiency is calculated on the basis of the given parameters:
• Accuracy
• Precision
• Recall
• F1 Score
The evaluation stage helps in:
• enhancing the research field
• identifying the fields that can be replicated
• identifying the fields with shortcomings, which can become the fields for further research
• collecting the knowledge for accumulation
Question 3: Explain 4Ws problem canvas with an example.
Answer: The 4Ws help in identifying the key elements related to the problem. The 4Ws stand for Who? What? Where? Why?
→ Who canvas
The Who canvas helps in analysing the people who are affected directly or indirectly. Under this, we find out who the stakeholders in this problem are and what we know about them.
→ What canvas
The What canvas determines the nature of the problem: what the problem is and how we know that it is a problem. Under this, evidence is gathered to prove that the selected problem actually exists.
→ Where canvas
The Where canvas helps us to look into the situation in which the problem arises, its context, and the locations where it is prominent.
→ Why canvas
The Why canvas helps us to understand who would benefit from the solution, what is to be solved, and where the solution will be deployed.
The 4Ws Problem Canvas is like this:
The | [stakeholders - people concerned with the problem] | Who?
Have a problem of | [issue, problem] | What?
When/while | [context, situation] | Where?
An ideal solution would be | [how will the solution help the stakeholders] | Why?
Example of a 4Ws canvas for the COVID-19 problem:
The | Community | Who?
Have a problem of | Spread of COVID-19 | What?
When/While | Coming in contact with an infected person | Where?
An ideal solution would be | Able to make a correct diagnosis based on the symptoms shown by the people so that they can take all precautions to prevent it from spreading | Why?
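The canvas above reads as a fill-in-the-blanks problem statement. As an illustrative sketch only, it can be represented as a small data structure; the class name `ProblemCanvas` and its method `statement` are hypothetical names invented for this example:

```python
from dataclasses import dataclass

@dataclass
class ProblemCanvas:
    who: str    # stakeholders
    what: str   # the issue or problem
    where: str  # context or situation
    why: str    # how an ideal solution would help

    def statement(self) -> str:
        """Fill the problem statement template from the four answers."""
        return (
            f"The {self.who} have a problem of {self.what} "
            f"when/while {self.where}. "
            f"An ideal solution would be {self.why}."
        )

covid_canvas = ProblemCanvas(
    who="community",
    what="spread of COVID-19",
    where="coming in contact with an infected person",
    why="able to make a correct diagnosis based on the symptoms shown by the people",
)
print(covid_canvas.statement())
```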
Question 4: List various ways to collect the required data for AI project.
Answer: Some of the ways in which the required data for an AI project can be acquired are:
• Surveys
• Web scraping
• Cameras
• Observations
• APIs (application programming interfaces), etc.
Question 5: Explain various classifications of AI model.
Answer: The AI models are classified into two categories:
1. Rule-based approach: It refers to AI modelling where the rules are defined by the developer. It is the purest form of AI and is limited in its ability to imitate intelligence, because it is limited by the size of its underlying rule base. The machine follows the rules or instructions given by the developer and performs its tasks accordingly. A drawback of this approach is that the learning is static: once trained, the machine doesn’t take into consideration any changes made to the original data set.
2. Learning-based approach: It refers to AI modelling where the machine learns by itself. Under the learning-based approach, the AI model is trained on the data fed to it and is then able to build a model that adapts to changes in the data. Under the learning-based approach, there are three subcategories:
i) Supervised learning: In supervised learning, the model is trained on a labelled dataset. It means that we have both the raw input data and the corresponding output. The training data set trains our network, while the test data set acts as new data for predicting the output or for checking the accuracy of the model. In this type of learning, the model learns from known results. This model performs at a fast pace because the time taken for training is less. Under supervised learning, there are two subcategories:
• Classification: Here the data is classified according to the labels. Such models work on discrete data, which means the data need not be continuous.
• Regression: Such models work on continuous data.
ii) Unsupervised learning: In this type of learning, the input used for training is neither classified nor labelled. This means that the data fed to the machine is random, and it is possible that the person training the model does not have any information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. They help the user in understanding what the data is about and what major features the machine has identified in it. There are two subcategories in unsupervised learning:
• Clustering: It refers to the unsupervised learning algorithm which can cluster unknown data according to the patterns or trends identified in it. The patterns observed might be ones already known to the developer, or the algorithm might even come up with some unique patterns.
• Dimensionality reduction: We humans are able to visualise up to 3 dimensions only, but according to a lot of theories and algorithms, there are various entities which exist beyond 3 dimensions, which means that we cannot visualise them. Hence, to make sense of them, we need to reduce their dimensions. This is where a dimensionality reduction algorithm is used.
iii) Reinforcement learning: Here, machine learning algorithms help software agents and machines to determine the ideal behaviour within a particular context so as to improve their performance. There is no labelled data set or known result associated with the data. Therefore, the only way in which a given task can be performed is by learning from experience. For every right action or decision, the algorithm is rewarded with positive reinforcement. For every wrong action, it is given negative reinforcement. In this way, it learns which actions need to be performed and which should be avoided. This type of learning can assist in industrial automation.
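The two supervised subcategories described above, classification and regression, can be contrasted with a minimal sketch. The data is hypothetical, and the techniques used (a 1-nearest-neighbour classifier and a least-squares line) are just two simple choices made for this illustration, not the only ways to do either task:

```python
# Labelled training data: each input comes with a known output.
# All values are hypothetical and chosen only for illustration.

# Classification: the output is a discrete label.
fruit_weights = [(110, "apple"), (120, "apple"), (15, "grape"), (20, "grape")]  # (grams, label)

def classify(weight):
    """Predict the label of the closest training example (1-nearest neighbour)."""
    nearest = min(fruit_weights, key=lambda pair: abs(pair[0] - weight))
    return nearest[1]

# Regression: the output is a continuous number.
hours = [1, 2, 3, 4]
marks = [52, 54, 56, 58]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(hours, marks)
print(classify(100))            # label predicted for an unseen weight
print(slope * 5 + intercept)    # marks predicted for 5 hours of study
```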
Question 6: Explain rule based approach with an example.
Answer: It refers to AI modelling where the rules are defined by the developer. It is the purest form of AI and is limited in its ability to imitate intelligence, because it is limited by the size of its underlying rule base. The machine follows the rules or instructions given by the developer and performs its task accordingly. A drawback of this approach is that the learning is static: once trained, the machine doesn’t take into consideration any changes made to the original training data set.
For example, suppose we have a data set that tells us the conditions under which an elephant may or may not be spotted while on safari. The parameters are outlook, temperature, humidity and wind. We take the various combinations of these parameters and note in which cases the elephant may be spotted and in which cases it may not. After going through all the cases, we feed this data into the machine along with the rules that cover all the possibilities. The machine trains on this data and is then ready to be tested. While testing, we tell the machine that outlook = overcast, temperature = normal, humidity = normal and wind = weak. On the basis of this testing data, the machine will be able to tell whether the elephant may be spotted or not and will display the prediction to us.
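A rough sketch of how such a rule-based system might look in code. The actual rules are not listed above, so the conditions inside the function are hypothetical and serve only to show that every rule is written by hand:

```python
def elephant_may_be_spotted(outlook, temperature, humidity, wind):
    """Hand-written rules (hypothetical; the real rules are not given in the example)."""
    # Note: these illustrative rules happen not to use the temperature parameter.
    if outlook == "overcast":
        return True
    if outlook == "sunny":
        return humidity == "normal"
    if outlook == "rain":
        return wind == "weak"
    return False  # no rule covers this case

# The test case from the example above.
print(elephant_may_be_spotted("overcast", "normal", "normal", "weak"))
```

Because the rules are fixed in the function body, supporting a new outlook value or a new parameter means editing the code by hand, which is the static behaviour described above.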
Question 7: Explain the various methods in learning based approach.
Answer: Learning based approach refers to the AI modelling where the machine learns by itself. There are 3 subcategories in learning based approach:
1. Supervised learning: In supervised learning, the model is trained on a labelled dataset. It means that we have both the raw input data and the corresponding output. The training dataset trains our network, while the test dataset acts as new data for predicting the output or for checking the accuracy of the model. In this type of learning, the model learns from known results. This model performs at a fast pace because the time taken for training is less. Under supervised learning, there are two subcategories:
• Classification: Here the data is classified according to the labels. Such models work on discrete data, which means the data need not be continuous.
• Regression: Such models work on continuous data.
2. Unsupervised learning: In this type of learning, the input used for training is neither classified nor labelled. This means that the data fed to the machine is random, and it is possible that the person training the model does not have any information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. There are two subcategories in unsupervised learning:
• Clustering: It refers to the unsupervised learning algorithm which can cluster unknown data according to the patterns or trends identified in it. The patterns observed might be ones already known to the developer, or the algorithm might even come up with some unique patterns.
• Dimensionality reduction: We humans are able to visualise up to 3 dimensions only, but according to a lot of theories and algorithms, there are various entities which exist beyond 3 dimensions, which means that we cannot visualise them. Hence, to make sense of them, we need to reduce their dimensions. This is where a dimensionality reduction algorithm is used.
3. Reinforcement learning: Here, machine learning algorithms help software agents and machines to determine the ideal behaviour within a particular context so as to improve their performance. There is no labelled data set or known result associated with the data. Therefore, the only way in which a given task can be performed is by learning from experience. For every right action or decision, the algorithm is rewarded with positive reinforcement. For every wrong action, it is given negative reinforcement. In this way, it learns which actions need to be performed and which should be avoided. This type of learning can assist in industrial automation.
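The reward-and-penalty idea behind reinforcement learning can be sketched with a toy agent that must discover which of two actions is correct. The environment, the action names and the 10% exploration rate are all assumptions made for this illustration; real reinforcement learning problems are far richer:

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

actions = ["left", "right"]
reward_for = {"left": -1, "right": +1}   # hypothetical environment: "right" is the correct action
value = {a: 0.0 for a in actions}        # the agent's running estimate of each action's reward
counts = {a: 0 for a in actions}

for step in range(200):
    # Explore a random action 10% of the time, otherwise exploit the best estimate so far.
    if rng.random() < 0.1:
        action = rng.choice(actions)
    else:
        action = max(actions, key=lambda a: value[a])
    reward = reward_for[action]          # positive or negative reinforcement
    counts[action] += 1
    # Update the running average reward for the chosen action.
    value[action] += (reward - value[action]) / counts[action]

best = max(actions, key=lambda a: value[a])
print(best, value)
```

The agent is never told which action is correct. It only sees the reward after acting and gradually prefers the action that has been rewarded.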
Question 8: Mention two/four disadvantages of rule based approach.
Answer: The disadvantages of rule-based approach are:
• The learning is static. Once trained, the machine doesn’t take into consideration any changes made to the original training dataset.
• The maintenance of these systems is time-consuming and costly.
• It is challenging to complement an already extensive knowledge base with new rules without introducing contradicting rules.
Question 9: Write brief notes on:
a) Supervised learning
b) Unsupervised learning
Answer:
a) Supervised Learning
In supervised learning, the model is trained on a labelled dataset. It means that we have both the raw input data and the corresponding output. The training dataset trains our network, while the test dataset acts as new data for predicting the output or for checking the accuracy of the model. In this type of learning, the model learns from known results. This model performs at a faster pace because the time taken for training is less. There are two subcategories in supervised learning:
• Classification
• Regression
b) Unsupervised Learning
In this type of learning, the input used for training is neither classified nor labelled. This means that the data fed to the machine is random, and it is possible that the person training the model does not have any information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. There are two subcategories in unsupervised learning:
• Clustering
• Dimensionality reduction
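To illustrate clustering, here is a minimal sketch of k-means on one-dimensional data. K-means is one common clustering algorithm, chosen here as an assumption since no specific algorithm is named above. The heights and the starting centres are hypothetical:

```python
# Unlabelled data: hypothetical heights in centimetres, with no categories attached.
heights = [150, 152, 151, 180, 182, 181]

def k_means_1d(points, centres, rounds=10):
    """Tiny k-means for one-dimensional data, starting from the given centres."""
    for _ in range(rounds):
        # Assign every point to its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

centres, clusters = k_means_1d(heights, centres=[150, 182])
print(centres)
print(clusters)
```

No labels such as "short" or "tall" are supplied. The two groups emerge from the data alone.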
Question 10: Explain the term Neural Network with diagram.
Answer: A neural network is based on an imitation of the neurons of the human brain. The neural network technique is used in creating computer programs that learn from data. A neural network is essentially a system of organizing machine learning algorithms to perform certain tasks. It is a fast and efficient way to solve problems for which the data set is very large, such as with images. Larger neural networks tend to perform better with large amounts of data, whereas traditional machine learning algorithms stop improving after a certain saturation point. The image that depicts a neural network is as follows:
Question 11: Mention the features of neural network.
Answer: Features of artificial neural network:
• The key feature of neural networks is that they are able to extract data features automatically without needing input from the programmer.
• Neural network systems are modelled on the human brain and nervous system.
• Every neural network node is essentially a machine learning algorithm.
• It is useful when solving problems for which the data set is very large.
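As a rough sketch of how nodes are organised into a network, the code below passes two inputs through one hidden layer and one output node. It assumes the common arrangement of layers feeding into one another and a sigmoid activation function, neither of which is specified above. The weights are hypothetical and untrained, and the training step that would learn them from data is omitted:

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-x))

def neuron(inputs, weights, bias):
    """One node: a weighted sum of its inputs passed through an activation function."""
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)) + bias)

def forward(inputs, layers):
    """Pass the inputs through each layer in turn; a layer is a list of (weights, bias) pairs."""
    for layer in layers:
        inputs = [neuron(inputs, weights, bias) for weights, bias in layer]
    return inputs

# Hypothetical, untrained weights chosen only for illustration.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_layer = [([1.0, -1.0], 0.0)]

output = forward([1.0, 1.0], [hidden_layer, output_layer])
print(output)
```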
Question 12: Differentiate between rule-based approach and learning-based approach.
Answer:
Rule-based approach | Learning-based approach
In this approach, the machine follows rules or instructions mentioned by the developer. | In this approach, the machine learns by itself.
In this approach, AI is achieved through rule-based technique. | In this approach, AI is achieved through machine learning technique.
The machine, once trained, doesn’t take into consideration any changes made in the original dataset. | This type of approach facilitates the acquisition of new knowledge.
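The last row of the comparison can be illustrated with a small sketch. The task (deciding whether a temperature counts as "hot"), the fixed threshold of 30 and both datasets are hypothetical, invented only to show a static rule next to a rule that is learnt again when the data changes:

```python
# Rule-based: the developer fixes the rule; it does not change when the data changes.
def rule_based_is_hot(temperature):
    return temperature > 30  # hypothetical threshold written by the developer

# Learning-based: the threshold is derived from labelled examples.
def learn_threshold(examples):
    """Place the threshold midway between the warmest 'not hot' and the coolest 'hot' example."""
    hot = [t for t, label in examples if label == "hot"]
    not_hot = [t for t, label in examples if label == "not hot"]
    return (max(not_hot) + min(hot)) / 2

old_data = [(25, "not hot"), (29, "not hot"), (33, "hot"), (36, "hot")]
new_data = [(18, "not hot"), (20, "not hot"), (24, "hot"), (27, "hot")]  # e.g. a cooler region

print(learn_threshold(old_data))  # threshold learnt from the first dataset
print(learn_threshold(new_data))  # threshold adapts when retrained on new data
print(rule_based_is_hot(27))      # the fixed rule still uses 30
```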
Question 13: Mention the various approaches used to evaluate the efficiency of a model.
Answer: The various approaches used to evaluate the efficiency of the model are:
• Accuracy
• Precision
• Recall
• F1 score
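The four measures listed above can be computed by comparing a model's predictions with the actual outcomes in the testing data. The sketch below uses their standard definitions for a two-class problem. The function name `evaluate` and the sample data are hypothetical:

```python
def evaluate(actual, predicted, positive="yes"):
    """Compute accuracy, precision, recall and F1 score for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives

    accuracy = (tp + tn) / len(pairs)                # share of all predictions that are correct
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of predicted positives that are right
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical testing data: actual outcomes versus the model's predictions.
actual    = ["yes", "yes", "yes", "no", "no",  "no", "no", "yes"]
predicted = ["yes", "no",  "yes", "no", "yes", "no", "no", "yes"]
scores = evaluate(actual, predicted)
print(scores)
```

The F1 score is the harmonic mean of precision and recall, so it is high only when both are high.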