Scoring Machine system. Testing the scoring model
The "Testing the scoring model" block contains the following sections:
- Creating a new scoring model test
- List of tests of scoring models
- Settings for testing scoring models
In the "Settings for testing scoring models" section, you can configure the parameters for building a scoring model test.
Here you can change the number of rows into which the test result is divided; the results are grouped into the specified number of rows. We recommend not increasing this number too much, so that the overall picture stays easy to read: use 10 to 25 rows at most. As in the settings for creating a scoring model, you can also specify here how the system should recognize a good result and a bad result in the first column of your file.
In the "Creating a new scoring model test" section, the test is created for the active scoring model. The screen shows which scoring model is currently active, that is, the model for which the test will be performed.
Note that, depending on the subscription type, the number of saved tests per model is limited. If the user has reached the limit, the oldest saved test is deleted before a new test is built. We therefore recommend keeping track of which tests are still relevant and deleting outdated tests yourself. The limit applies to each model separately: with a limit of 20 tests per model and 10 models, the user can save up to 20 tests for each model, or 200 saved tests in total.
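The limit rule above can be illustrated with a small Python sketch. It is not how Scoring Machine is implemented; the class `SavedTests` and its methods are hypothetical names, and the sketch assumes that the test removed is the oldest one saved for the same model.

```python
from collections import defaultdict, deque


class SavedTests:
    """Keeps at most `limit` saved tests per model, dropping the oldest first."""

    def __init__(self, limit):
        self.limit = limit
        # model name -> saved test names, oldest on the left
        self._by_model = defaultdict(deque)

    def add(self, model, test_name):
        """Save a test; return the name of the deleted test, or None."""
        tests = self._by_model[model]
        # Assumption: the oldest test of this model is removed before saving.
        evicted = tests.popleft() if len(tests) >= self.limit else None
        tests.append(test_name)
        return evicted

    def total(self):
        """Total number of saved tests across all models."""
        return sum(len(tests) for tests in self._by_model.values())


# Limit of 20 tests per model, 10 models, 25 tests built for each model.
store = SavedTests(limit=20)
for m in range(10):
    for t in range(25):
        store.add(f"model-{m}", f"test-{t}")

print(store.total())
```

With these numbers each model keeps only its 20 most recent tests, so the total never exceeds 200.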
To create a test, select a data file with the extension .xls or .xlsx on your computer and upload it to the Scoring Machine system. Then click "Create a new test". The system runs a series of checks on the file; if this initial check finds that the file does not meet the requirements, an error is displayed. After this brief check, the Scoring Machine starts analyzing the file and displays a notification.
The time needed to create a test depends on the amount of data to be analyzed: the larger the data file, the longer the system takes to analyze it and create the test. Therefore, if the test is for a specific model and you already know exactly which attributes the model uses, you can safely delete the unused columns from the file.
A very important step before creating a test is preparing the file correctly so that the system can analyze it properly. In addition to the general requirements and recommendations for the file, note that the test file must contain the same attributes that are selected in the model. The attributes (columns in the file) must have the same names, and the attribute values must be written the same way, as in the model, that is, as they were when the model was built.
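Before uploading, this requirement can be checked locally. The following Python sketch is only an illustration: the function `check_test_file` and its arguments are hypothetical, the file is assumed to be already read into a header row and data rows, and names and values are assumed to be compared exactly (including letter case), which the system's own checks may or may not do.

```python
def check_test_file(header, rows, model_attributes):
    """Return a list of problems; an empty list means no mismatch was found.

    header           -- column names from the first row of the file
    rows             -- data rows as lists of cell values
    model_attributes -- dict: attribute name -> set of values used in the model
    """
    problems = []
    # Every attribute selected in the model needs a column with the same name.
    missing = [name for name in model_attributes if name not in header]
    for name in missing:
        problems.append(f"missing column: {name}")
    # Values in each matching column should be ones the model already knows.
    for name, known_values in model_attributes.items():
        if name in missing:
            continue
        col = header.index(name)
        seen = {str(row[col]) for row in rows}
        unknown = sorted(seen - {str(v) for v in known_values})
        if unknown:
            problems.append(f"unknown values in {name}: {', '.join(unknown)}")
    return problems


# Made-up example: one column is named differently, one value is spelled differently.
model = {"Region": {"North", "South"}, "Age group": {"18-25", "26-40"}}
header = ["Result", "Region", "age group"]
rows = [["good", "North", "18-25"], ["bad", "north", "26-40"]]
problems = check_test_file(header, rows, model)
for line in problems:
    print(line)
```

In this made-up example the check reports that the "Age group" column is missing (the file calls it "age group") and that "north" is not a value the model knows for "Region".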
All file requirements can be found here.
After the Scoring Machine finishes creating a test, the test immediately appears in the general list of tests, and the user receives an email saying that the test creation process is complete. If the test cannot be created for some reason (for example, errors in the file or empty cells where they are not allowed), the user is also notified by email.
The "List of tests of scoring models" section displays the tests already created and saved in Scoring Machine. The list shows 10 tests per page. To go to the next, previous, last, or first page, click the corresponding symbol at the bottom of the table.
To find a specific test, if there are many, you can use the search. To do this, click on the "Search" button in the upper right corner.
To open a test, click its row in the table.
When you open a test, the page displays the name of the scoring model for which the test was processed, the name and description of the test, and the final Gini Index result.
The name and description of the test are chosen freely by the user. They serve only to help the user tell the tests apart later. You can change the name and/or description of the test through "Actions". We recommend changing them immediately after creating the test.
Further down the page, a table displays the test results. The results are grouped into the number of rows that was specified in the test settings at the time the test was created.
The columns in the test results table are as follows:
1. Count of scores: the number of points scored by the records in this row.
2. Total items: the total number of records in the file with the specified number of points scored.
3. Count of Good: the number of records marked as "good" in the file with the specified number of points scored.
4. Count of Bad: the number of records marked as "bad" in the file with the specified number of points scored.
5. Bad Rate, %: the proportion of bad records in relation to the total number of records with the specified number of points scored.
6. Cum. Total count: the cumulative number of records, that is, the number of records in the current row plus all previous rows.
7. Cum. Total, %: the cumulative number of records as a percentage of the total number of records in the entire file.
8. Cum. Good count: the cumulative number of good records, that is, the number of good records in the current row plus all previous rows.
9. Cum. Good, %: the cumulative number of good records as a percentage of the total number of good records in the entire file.
10. Cum. Bad count: the cumulative number of bad records, that is, the number of bad records in the current row plus all previous rows.
11. Cum. Bad, %: the cumulative number of bad records as a percentage of the total number of bad records in the entire file.
12. Gini Index: the main indicator of the predictive power of the model. Pay particular attention to the total Gini at the very bottom of the table. The higher the result, the higher the predictive power of the model.
A model can be considered of sufficiently high quality if testing shows a Gini result of 30% or more. If the result is lower, the model is, as a rule, not worth using. However, the predictive power required depends heavily on the field of activity and on how well the attributes for analysis can be selected. A model with a Gini result above 50% or 60% is already a fairly strong predictive model for almost any field of activity.
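To make the columns above concrete, here is a minimal Python sketch that builds such a table from per-row good and bad counts. The function name `build_test_table`, the dictionary keys, and the sample numbers are all hypothetical. The Gini formula used here (a trapezoid sum over the cumulative good and bad shares) is a common way to compute the index and is an assumption; the source does not state how Scoring Machine calculates it.

```python
def build_test_table(bands):
    """bands: list of (score_label, good_count, bad_count), lowest scores first.

    Returns (table_rows, gini_percent).
    """
    total_good = sum(good for _, good, _ in bands)
    total_bad = sum(bad for _, _, bad in bands)
    if total_good == 0 or total_bad == 0:
        raise ValueError("the file must contain both good and bad records")
    total_all = total_good + total_bad

    table = []
    cum_good = cum_bad = 0
    prev_good_share = prev_bad_share = 0.0
    area = 0.0  # running trapezoid sum used for the Gini Index
    for label, good, bad in bands:
        cum_good += good
        cum_bad += bad
        good_share = cum_good / total_good
        bad_share = cum_bad / total_bad
        # Assumed formula: Gini = |1 - sum((B_i - B_{i-1}) * (G_i + G_{i-1}))|
        area += (bad_share - prev_bad_share) * (good_share + prev_good_share)
        prev_good_share, prev_bad_share = good_share, bad_share
        items = good + bad
        table.append({
            "scores": label,
            "total_items": items,
            "good": good,
            "bad": bad,
            "bad_rate_pct": 100 * bad / items if items else 0.0,
            "cum_total": cum_good + cum_bad,
            "cum_total_pct": 100 * (cum_good + cum_bad) / total_all,
            "cum_good": cum_good,
            "cum_good_pct": 100 * good_share,
            "cum_bad": cum_bad,
            "cum_bad_pct": 100 * bad_share,
        })
    return table, 100 * abs(1 - area)


# Made-up data: three score ranges with their good and bad counts.
sample = [("300-399", 10, 40), ("400-499", 30, 30), ("500-599", 60, 10)]
table, gini = build_test_table(sample)
for row in table:
    print(row["scores"], row["bad_rate_pct"], row["cum_good_pct"], row["cum_bad_pct"])
print(round(gini, 2))
```

For this made-up sample the total Gini comes out at about 58.75%, which by the guidance above would count as a fairly strong model.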
With "Actions" you can edit the name and description, export all test data to an Excel file, or delete the test.
The test results should then be analyzed to determine whether the model is good enough or still needs to be refined and rebuilt, and what decisions can be made based on it. More details here.