Applied Modeling Techniques and Data Analysis 2. Group of authors

them to the tax claim. More formally, both of the ratios

and

are computed. Then, the minimum of these two ratios and 1 is taken. This minimum is the value of the variable Z, which thus ranges from 0 to 1.
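
The capping step can be sketched in Python as follows. This is only an illustration: the two ratios are passed in under the placeholder names `ratio_a` and `ratio_b`, the helper `z_value` is hypothetical, and the figures are invented. The lower bound of 0 holds as long as both ratios are non-negative.

```python
def z_value(ratio_a: float, ratio_b: float) -> float:
    """Return the smaller of the two ratios, capped at 1."""
    return min(ratio_a, ratio_b, 1.0)


# Invented ratios for three taxpayers (illustrative numbers only).
examples = [(0.30, 0.45), (1.80, 0.75), (2.10, 1.40)]
z_values = [z_value(a, b) for a, b in examples]
print(z_values)  # [0.3, 0.75, 1.0]
```

In the third case both ratios exceed 1, so Z is capped at 1.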

Now, for both the tax claim (TC) and Z, we calculate the 25th percentile (Q₁), the median (Q₂) and the 75th percentile (Q₃). We then state that a taxpayer may be considered interesting if he satisfies one of the following conditions:

The three above-mentioned rules can be represented as in Figure 1.3.

Figure 1.3. Determining interesting and not interesting taxpayers. For a color version of this figure, see www.iste.co.uk/dimotikalis/analysis2.zip
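
The quartiles of TC and Z on which the rules rely could be computed along the following lines. This is a sketch under assumptions: the sample values are invented, the helper name `quartiles` is hypothetical, and the "inclusive" interpolation method is a choice made here, since the text does not say which quantile definition was used. The classification rules themselves are not coded.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using inclusive linear interpolation (an assumption)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q2, q3


# Invented sample data: tax claims in thousands of euros, and Z values in [0, 1].
tax_claim = [1, 2, 3, 5, 8, 13, 21, 34, 55]
z = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0]

tc_q1, tc_q2, tc_q3 = quartiles(tax_claim)
z_q1, z_q2, z_q3 = quartiles(z)
print(tc_q1, tc_q2, tc_q3)  # 3.0 8.0 21.0
```

Each taxpayer's TC and Z can then be compared against these six thresholds.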

Once the population of our dataset is entirely divided into interesting and not interesting taxpayers, we can see from Table 1.1 that the interesting ones are far more profitable than the others (tax claim values are in thousands of euros). A machine learning tool able to distinguish these two kinds of taxpayers fairly well would then be very useful.

The first task of our model will then be to identify, with a certain degree of confidence, the taxpayers who are more likely to have evaded (both in absolute terms and as a percentage of revenues or turnover).

The literature on tax fraud detection, although it uses different methods and algorithms, is usually concerned only with this issue, i.e. finding the best way to identify the most relevant cases of tax evasion (Bonchi et al. 1999; Wu et al. 2012; Gonzalez and Velasquez 2013; de Roux et al. 2018).

There is another crucial issue that has to be taken into account, i.e. the tax authorities’ effective ability to collect the tax debt arising from the tax notices sent to all of the unfaithful taxpayers.

Table 1.1. Tax claim, interesting and not interesting taxpayers

	Not interesting			Interesting
Tax claim	Num	Total tax claim	Average	Num	Total tax claim	Average
[0 - 1]	736	322	0.44	0	0	0.00
[1 - 2]	631	942	1.49	0	0	0.00
[2 - 5]	1,607	5,409	3.37	138	563	4.08
[5 - 10]	1,127	7,727	6.86	517	4,157	8.04
[10 - 20]	446	5,911	13.25	902	13,139	14.57
[20 - 50]	0	0	0.00	1,164	36,056	30.98
[50 - 100]	0	0	0.00	433	30,055	69.41
[100+]	0	0	0.00	327	101,987	311.89
Total	4,547	20,311	4.47	3,481	185,957	53.42
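
The gap between the two groups can be checked directly from the Total row of Table 1.1. The short Python computation below uses only figures from the table; the variable names are illustrative.

```python
# Totals from Table 1.1 (tax claims in thousands of euros).
not_interesting = {"num": 4547, "total_claim": 20311}
interesting = {"num": 3481, "total_claim": 185957}

avg_not = not_interesting["total_claim"] / not_interesting["num"]
avg_int = interesting["total_claim"] / interesting["num"]

# Share of the overall tax claim that comes from interesting taxpayers.
overall_claim = interesting["total_claim"] + not_interesting["total_claim"]
claim_share = interesting["total_claim"] / overall_claim

# Share of taxpayers that are classified as interesting.
taxpayer_share = interesting["num"] / (interesting["num"] + not_interesting["num"])

print(round(avg_not, 2), round(avg_int, 2))  # 4.47 53.42
print(round(100 * claim_share, 1))           # 90.2
print(round(100 * taxpayer_share, 1))        # 43.4
```

In other words, interesting taxpayers make up about 43% of the population but account for about 90% of the total tax claim, with an average claim roughly twelve times that of the others.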

1.2.3. Enforced tax recovery proceedings

What happens if a taxpayer does not spontaneously pay the additional tax he is charged? After a while, the tax authorities will deploy coercive collection procedures. However, as we have seen above, these procedures are highly ineffective, as they collect only about 5% of the overall credits claimed against the audited taxpayers.

Indeed, the data show that coercive procedures take place in almost 40% of cases, although their distribution is not uniform: they are more frequent when the tax bill is high, as reported in Table 1.2 (again, tax claim values are in thousands of euros).
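
As a quick check of this pattern, the share of coercive procedures can be computed for the first four intervals of Table 1.2. The figures in the sketch below are taken from the table; the variable names are illustrative.

```python
# (interval, no coercive procedure, coercive procedure), first four rows of Table 1.2.
rows = [
    ("[0 - 1]", 578, 158),
    ("[1 - 2]", 476, 155),
    ("[2 - 5]", 1268, 477),
    ("[5 - 10]", 1072, 572),
]

# Fraction of taxpayers in each interval who were subject to a coercive procedure.
rates = {interval: yes / (no + yes) for interval, no, yes in rows}

for interval, rate in rates.items():
    print(f"{interval}: {rate:.1%}")
# [0 - 1]: 21.5%
# [1 - 2]: 24.6%
# [2 - 5]: 27.3%
# [5 - 10]: 34.8%
```

The rate rises steadily with the size of the tax claim over these intervals.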

Table 1.2. Number of coercive procedures per tax claim interval

Tax claim	Coercive procedures		Total
	No	Yes
[0 - 1]	578	158	736
[1 - 2]	476	155	631
[2 - 5]	1,268	477	1,745
[5 - 10]	1,072	572	1,644
[10 - 20]	745