- Preparing the tools
- Preparing data
- Building the model
- Predicting sentiment text
Preparing the tools
We'll create a new Console App (.NET Framework) project in Visual Studio, then install the Microsoft.ML package from the NuGet package manager. On my machine, I use Visual Studio with .NET Framework 4.7.1 and Microsoft.ML version 0.11. ML.NET is updated frequently and its API has changed between releases, so make sure you use the same version if you want to run the code as written.
We'll include the required namespaces.
using Microsoft.ML;
using Microsoft.ML.Data;
using System;
using System.Collections.Generic;
using System.Linq;
Preparing data
I prepared a simple sentiment data set to train the model in this tutorial. It contains imaginary user opinions, where a positive opinion is labeled '1' and a negative opinion is labeled '0'. It is a tab-separated text file with a binary label and the sentiment text. Below is a sample of the training data.
Label	Text
1	exciting show
1	amazing performance
1	it is great!
1	I am excited a lot
1	it is terrific
1	Definitely good one
1	Excellent, very satisfied
1	Glad we went
1	Once again outstanding!
1	awesome! excellent show
1	This is truly a good one!
0	it's mediocre!
0	Not good at all!
0	It is rude
0	I don't like this type
0	poor performance
0	Boring, not good at all!
0	not liked
0	I hate this type of things
...
You can find the full sentiment text at the end of this post. Copy the text and save it as SentimentText.tsv in your target folder. When you save it, make sure that the Label and Text columns are separated by a tab. We'll point to the file with a path constant.
private static readonly string DataPath = @"C://tmp/SentimentText.tsv";
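A line that uses spaces instead of a tab will not split into two columns, so it can help to check the file before training. The following is a minimal sketch and not part of the original project: `SentimentFileCheck` and `FindBadLines` are hypothetical names, and the code assumes the first line is the Label/Text header.

```csharp
using System.Collections.Generic;

public static class SentimentFileCheck
{
    // Returns the 1-based numbers of lines that are not "<0 or 1><TAB><text>".
    // Assumption: the first line is the "Label<TAB>Text" header and is skipped.
    public static List<int> FindBadLines(IEnumerable<string> lines)
    {
        var bad = new List<int>();
        var lineNumber = 0;
        foreach (var line in lines)
        {
            lineNumber++;
            if (lineNumber == 1) continue; // header row

            var parts = line.Split('\t');
            var ok = parts.Length == 2
                     && (parts[0] == "0" || parts[0] == "1")
                     && parts[1].Trim().Length > 0;
            if (!ok) bad.Add(lineNumber);
        }
        return bad;
    }
}
```

It could be called as `SentimentFileCheck.FindBadLines(System.IO.File.ReadLines(DataPath))`. An empty list means every data row has the expected shape. Note that blank lines are reported as bad too.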
Building the model
First, we need to create sentiment data and prediction container classes.
public class SentimentIssue
{
    [LoadColumn(0)]
    public bool Label { get; set; }

    [LoadColumn(1)]
    public string Text { get; set; }
}

public class SentimentPrediction
{
    [ColumnName("PredictedLabel")]
    public bool Prediction { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}
We'll start by creating MLContext.
var mlContext = new MLContext(seed: 1);
Then we'll load the sentiment text file and define a pipeline step that converts the text into numeric features.
var data = mlContext.Data.LoadFromTextFile<SentimentIssue>(
    path: DataPath,
    hasHeader: true,
    separatorChar: '\t');

var dataProcessPipeLine = mlContext.Transforms.Text
    .FeaturizeText(outputColumnName: DefaultColumnNames.Features,
                   inputColumnName: nameof(SentimentIssue.Text));
We use the StochasticDualCoordinateAscent trainer for binary classification.
var trainingPipeLine = dataProcessPipeLine
.Append(mlContext.BinaryClassification
.Trainers.StochasticDualCoordinateAscent());
We'll check the model with 5-fold cross-validation and collect the AUC and accuracy of each fold.
var cvResults = mlContext.BinaryClassification
    .CrossValidate(data, estimator: trainingPipeLine, numFolds: 5);
var aucs = cvResults.Select(r => r.Metrics.Auc);
var accs = cvResults.Select(r => r.Metrics.Accuracy);
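To make the `numFolds: 5` argument concrete, here is a library-free sketch of the idea behind k-fold cross-validation: the rows are split into five groups, and each group serves once as the test set while the other four are used for training. `FoldSketch` and `Split` are hypothetical names, and the round-robin assignment is a simplification for illustration. It is not a description of how ML.NET partitions the data internally.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FoldSketch
{
    // Yields one (train, test) pair of row indices per fold.
    // Row i is placed in the test set of fold (i % numFolds).
    public static IEnumerable<(int[] Train, int[] Test)> Split(int rowCount, int numFolds)
    {
        var rows = Enumerable.Range(0, rowCount).ToArray();
        for (var fold = 0; fold < numFolds; fold++)
        {
            var test = rows.Where(i => i % numFolds == fold).ToArray();
            var train = rows.Where(i => i % numFolds != fold).ToArray();
            yield return (train, test);
        }
    }
}
```

With 10 rows and 5 folds, each fold tests on 2 rows and trains on the remaining 8. Every row is used for testing exactly once, which is why the metrics are averaged over the folds.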
Finally, we fit the model on the full data set.
var model = trainingPipeLine.Fit(data);
Predicting sentiment text
We can predict new sentiment data as shown below.
var predEngine = mlContext.Model
    .CreatePredictionEngine<SentimentIssue, SentimentPrediction>(model);
var resultprediction = predEngine.Predict(item);
Here, I've tested the model with the sentiment texts below.
var opinions = new List<SentimentIssue>
{
    new SentimentIssue {Text = "This is an awful!"},
    new SentimentIssue {Text = "This is excellent!"},
    new SentimentIssue {Text = "I like it!"},
    new SentimentIssue {Text = "don't like this one"},
};
And the result was as follows.
The result looks good. We can improve the model by adding more training data.
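If you extend the training file, each new row has to keep the same two-column, tab-separated shape. Below is a minimal sketch of a helper for that. `TrainingDataAppender`, `ToTsvLine`, and `Append` are hypothetical names that are not part of the original project, and any example texts you pass in are your own.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TrainingDataAppender
{
    // Formats one labeled example as "<0 or 1><TAB><text>".
    // Tabs and line breaks inside the text are replaced with spaces
    // so the row still has exactly two columns.
    public static string ToTsvLine(bool isPositive, string text)
    {
        var clean = text.Replace('\t', ' ').Replace('\r', ' ').Replace('\n', ' ').Trim();
        return (isPositive ? "1" : "0") + "\t" + clean;
    }

    // Appends the examples to an existing training file.
    public static void Append(string path, IEnumerable<(bool IsPositive, string Text)> examples)
    {
        File.AppendAllLines(path, examples.Select(e => ToTsvLine(e.IsPositive, e.Text)));
    }
}
```

Make sure the existing file ends with a line break before appending. Otherwise the first new row is joined to the last existing one.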
In this post, we've learned how to classify sentiment data with ML.NET in C#.
The full source code and test sentiment text are listed below.
using Microsoft.ML;
using Microsoft.ML.Data;
using System;
using System.Collections.Generic;
using System.Linq;

namespace SentimentDT
{
    // One row of the training file: a binary label and the opinion text.
    public class SentimentIssue
    {
        [LoadColumn(0)]
        public bool Label { get; set; }

        [LoadColumn(1)]
        public string Text { get; set; }
    }

    // The model output for one opinion.
    public class SentimentPrediction
    {
        [ColumnName("PredictedLabel")]
        public bool Prediction { get; set; }
        public float Probability { get; set; }
        public float Score { get; set; }
    }

    class Program
    {
        private static readonly string DataPath = @"C://tmp/SentimentText.tsv";

        static void Main(string[] args)
        {
            var mlContext = new MLContext(seed: 1);
            var sentimentModel = BuildSentimentModel(mlContext);

            var opinions = new List<SentimentIssue>
            {
                new SentimentIssue {Text = "This is an awful!"},
                new SentimentIssue {Text = "This is excellent!"},
                new SentimentIssue {Text = "I like it!"},
                new SentimentIssue {Text = "don't like this one"},
            };

            PredictSentiment(mlContext, sentimentModel, opinions);
            Console.ReadKey();
        }

        private static ITransformer BuildSentimentModel(MLContext mlContext)
        {
            // Load the tab-separated training data.
            var data = mlContext.Data.LoadFromTextFile<SentimentIssue>(
                path: DataPath,
                hasHeader: true,
                separatorChar: '\t');

            // Convert the text into a numeric feature vector.
            var dataProcessPipeLine = mlContext.Transforms.Text
                .FeaturizeText(outputColumnName: DefaultColumnNames.Features,
                               inputColumnName: nameof(SentimentIssue.Text));

            var trainingPipeLine = dataProcessPipeLine
                .Append(mlContext.BinaryClassification
                    .Trainers.StochasticDualCoordinateAscent());

            // Evaluate with 5-fold cross-validation.
            var cvResults = mlContext.BinaryClassification
                .CrossValidate(data, estimator: trainingPipeLine, numFolds: 5);
            var aucs = cvResults.Select(r => r.Metrics.Auc);
            var accs = cvResults.Select(r => r.Metrics.Accuracy);

            // Train the final model on all of the data.
            var model = trainingPipeLine.Fit(data);

            Console.WriteLine("Model accuracy info:");
            Console.WriteLine($"Accuracy: {accs.Average()},AUC: {aucs.Average()}");
            return model;
        }

        private static void PredictSentiment(MLContext mlContext,
            ITransformer model, List<SentimentIssue> texts)
        {
            var predEngine = mlContext.Model
                .CreatePredictionEngine<SentimentIssue, SentimentPrediction>(model);

            Console.WriteLine("\nText | Prediction | Positive probability");
            foreach (var item in texts)
            {
                var resultprediction = predEngine.Predict(item);
                var predSentiment = Convert.ToBoolean(resultprediction.Prediction)
                    ? "Positive"
                    : "Negative";
                Console.WriteLine("{0} | {1} | {2}",
                    item.Text, predSentiment, resultprediction.Probability);
            }
        }
    }
}
SentimentText.tsv file content.
Label	Text
1	I like it
1	like it a lot
1	It's really good
1	Recommend! I really enjoyed!
1	It's really good
1	recommend too
1	outstanding performance
1	it's good! recommend!
1	Great!
1	really good. Definitely, recommend!
1	It is fun
1	Exceptional! liked a lot!
1	highly recommend this
1	fantastic show
1	exciting, liked.
1	it's ok
1	exciting show
1	amazing performance
1	it is great!
1	I am excited a lot
1	it is terrific
1	Definitely good one
1	Excellent, very satisfied
1	Glad we went
1	Once again outstanding!
1	awesome! excellent show
1	This is truly a good one!
0	it's mediocre!
0	Not good at all!
0	It is rude
0	I don't like this type
0	poor performance
0	Boring, not good at all!
0	not liked
0	I hate this type of things
0	not recommend, not satisfied
0	not enjoyed, I don't recommend this.
0	disgusting movie
0	waste of time, poor show
0	feel tired after watching this
0	horrible performance
0	not so good
0	so boring I fell asleep
0	a bit strange
0	terrible! I did not expect.
0	This is an awful
0	Nasty and horrible!
0	Offensive, it is crap!
0	Disappointing! not liked.