Intelligent Data Analytics for Terror Threat Prediction. Group of authors
classifies as a large margin between two types of data: the first is drawn as circles and the second as triangles. The two classes are separated with the maximum distance (thick line) between them. The large margin shown in Figure 1.5(a) means that the separating line lies equally far from the nearest circles and triangles, so the distance between the two data types is maximized across that margin. As shown in Figure 1.5(b), SVM also supports multi-dimensional data.
Figure 1.5 Hyperplane in 2-D and 3-D.
1.4.1.2.1 Cost Function and Gradient Features
The SVM algorithm seeks to maximize the margin between the data points and the hyperplane. The loss function that helps maximize the margin is the hinge loss [8], defined as follows:
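$$c(x, y, f(x)) = \max\bigl(0,\; 1 - y\,f(x)\bigr) \tag{1.2}$$
Here, in the standard form of the hinge loss, y ∈ {−1, +1} is the expected value and f(x) is the predicted value. The cost is 0 when the predicted value and the expected value have the same sign and the point lies on or beyond the margin, that is, when y·f(x) ≥ 1; otherwise the cost grows linearly with the size of the violation.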
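As a minimal sketch of how this cost behaves, the following Python snippet evaluates the hinge loss for a few samples. It assumes the standard form max(0, 1 − y·f(x)) with labels in {−1, +1}; the function names `hinge_loss` and `mean_hinge_loss` and the sample values are illustrative and do not come from the text.

```python
def hinge_loss(y_true, score):
    """Hinge loss for one sample, assuming y_true is -1 or +1.

    `score` is the raw predicted value f(x), not a class label.
    """
    return max(0.0, 1.0 - y_true * score)


def mean_hinge_loss(labels, scores):
    """Average hinge loss over a batch of samples."""
    return sum(hinge_loss(y, s) for y, s in zip(labels, scores)) / len(labels)


# Illustrative data (assumed, not from the text).
labels = [1, -1, 1, -1]
scores = [2.0, -0.5, -1.0, -3.0]

losses = [hinge_loss(y, s) for y, s in zip(labels, scores)]
average = mean_hinge_loss(labels, scores)
print(losses, average)
```

In this example the first and last samples are on the correct side and outside the margin, so their loss is 0. The second sample has the correct sign but lies inside the margin and is still penalized, and the third has the wrong sign and receives the largest loss.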
1.4.2 Combating Misinformation on Instagram
Classifying the content shared by users is a prevalent approach to combating misinformation in social media. Baseline classification algorithms such as Naïve Bayes and SVM have been used extensively for detecting rumors, as discussed in Section 1.4. Although these algorithms separate rumors from facts to some extent, better techniques are still needed to improve the efficiency of rumor classification. Social networks such as Facebook, WhatsApp, Instagram and Twitter already use advanced techniques, but they still fail to classify rumors exactly.
One of the popular social networks, Facebook, has started using third-party fact-checkers in its Instagram application (in the US) to detect whether a given post contains factual or false information [33]. These third-party fact-checkers are located globally and rate how factual or false a particular post is. When something in a post appears to be wrong, the fact-checkers immediately assess its ratio of fact to misinformation.
If a post has a higher false ratio, it is immediately labeled “False information”; otherwise it is left unlabeled. It is then the user’s responsibility to decide, based on the false ratio and fact ratio, whether to view that post and whether to share it with friends or communities. By using third-party fact-checkers, Instagram is trying to combat misinformation on social networks. Figure 1.6 gives a brief idea of this method.
Figure 1.6 Combating misinformation in Instagram [33].
1.5 Factors to Detect Rumor Source
Rumor detection alone is not enough to prevent these cyber-crimes in social media; finding the source also plays an important role in preventing further diffusion and punishing the culprit. Finding the source of rumors in a network was first discussed in Ref. [9]. Since then, much research has been done, introducing several factors to be considered in RS identification. Four main factors are considered: diffusion models, network structure, evaluation metrics, and centrality measures. Each factor is explained in the following sections with examples. After rumor detection, these factors are used to find the rumor source with the source detection methods for social networks explained in Section 1.5.2.
1.5.1 Network Structure
Network structure can be derived from two parameters: network topology and network observation [9]. Network topology describes whether the network is structured as a tree or as a graph. Source identification is more complex in a graph topology than in a tree topology: a tree has exactly one root node and allows no loops, whereas a graph has no root node and allows loops. Network observation is the second parameter of network structure; it is used to observe the network during rumor propagation in order to learn the states of the nodes at a particular time. A network can be observed in three ways [11]: complete, snapshot and monitor.
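The distinction between the two topologies can be checked programmatically. The following Python sketch tests whether an undirected network is a tree, using the standard fact that a tree is a connected graph with exactly n − 1 edges (and therefore no loops). The function name `is_tree`, the node numbering and the example edge lists are assumptions made for illustration.

```python
from collections import deque


def is_tree(num_nodes, edges):
    """Return True if the undirected network is connected and has no loops.

    Nodes are assumed to be numbered 0 .. num_nodes - 1.
    """
    # A tree on n nodes always has exactly n - 1 edges.
    if num_nodes == 0 or len(edges) != num_nodes - 1:
        return False

    adjacency = {node: [] for node in range(num_nodes)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Breadth-first search from node 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == num_nodes


# A 7-node tree rooted at node 0 (illustrative).
tree_edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
# A 4-node graph containing the loop 0-1-2-0 (illustrative).
graph_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]

print(is_tree(7, tree_edges), is_tree(4, graph_edges))
```

Source detection methods designed for trees can rely on this loop-free property, which is one reason the tree case is the simpler of the two.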
1.5.1.1 Network Topology
In computer networks, network topology is defined as the physical and logical design of a network. The physical design is the actual layout of the cables and other network devices. The logical design is the way in which the network appears to the devices that use it.
In complex networks, network topology is the arrangement of the network as a generic graph or a tree. In general, networks in many domains, such as medical, security, water and gas pipelines, and power grids, have a graph structure. These graphs need to be restructured into two topologies: d-regular trees and random geometric trees [34]. Rumor source identification was first discussed with methods for general trees and general graphs based on a rumor source estimator. The rumor source estimator plays a key role in finding the exact source of a rumor. The source estimator is mainly based on maximum likelihood (ML) estimation, which is equivalent to a combinatorial problem [9, 35]. The following section explains the techniques required to detect the rumor source in trees, such as the rumor source estimator, ML estimator, rumor centrality, and message passing algorithms.
1.5.1.2 Network Observation
In rumor source identification, network structure plays an important role. When the structure of the network is known, it is easy to find how a rumor spreads in the network using diffusion models such as SI, SIS, SIR and SIRS. If these diffusion models are backtracked, the rumor source can be detected easily. To learn the structure of the network, another model called network observation is used, which provides information about the state of each node in the network at a particular time. A node can be in one of three states: susceptible (able to be infected), infected (able to spread the rumor further) or recovered (cured and no longer infected) [10]. If it is observed whether each node is susceptible, infected or recovered, it is easy to derive the structure of the network from that knowledge. Network observation can be done in three ways: complete observation, snapshot observation and monitor observation.
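To make the node states concrete, the following Python sketch simulates a simple discrete-time SIR diffusion and records the state of every node at every time step. It is an illustrative simplification, not the method of the cited works: the function name `simulate_sir`, the parameters `infect_prob` and `recover_prob`, and the 7-node example tree are all assumptions.

```python
import random

SUSCEPTIBLE, INFECTED, RECOVERED = "S", "I", "R"


def simulate_sir(adjacency, source, infect_prob, recover_prob, steps, seed=0):
    """Simulate SIR spreading and return the node states at every time step.

    The returned list holds one {node: state} dict per time step,
    starting with the initial state before any spreading.
    """
    rng = random.Random(seed)
    states = {node: SUSCEPTIBLE for node in adjacency}
    states[source] = INFECTED
    history = [dict(states)]

    for _ in range(steps):
        new_states = dict(states)
        for node, state in states.items():
            if state != INFECTED:
                continue
            # An infected node may pass the rumor to each susceptible neighbor.
            for neighbor in adjacency[node]:
                if states[neighbor] == SUSCEPTIBLE and rng.random() < infect_prob:
                    new_states[neighbor] = INFECTED
            # An infected node may recover and stop spreading.
            if rng.random() < recover_prob:
                new_states[node] = RECOVERED
        states = new_states
        history.append(dict(states))
    return history


# A 7-node tree given as an adjacency list, with node 0 as the rumor source.
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5, 6], 3: [1], 4: [1], 5: [2], 6: [2]}

history = simulate_sir(tree, source=0, infect_prob=0.5, recover_prob=0.2, steps=5)
snapshot = history[-1]  # node states at the final time step only
print(snapshot)
```

Here `history` holds the state of every node at every time step, whereas `snapshot` keeps only the states at a single time; this is the kind of per-node state information that network observation provides.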
1.5.1.2.1 Complete Observation
Complete observation of a network provides comprehensive information, such as whether a node is susceptible, infected or recovered at each time interval [11]. Knowing the state of a node at a single time is not enough; multiple time intervals are required. Complete observation provides this knowledge across different time intervals. Complete observation of a small-scale network is easy because the network is small, but it is hardly possible in large-scale networks. Figure 1.7 illustrates this problem. Figure 1.7(a) shows a regular tree with 7 nodes, considered a small-scale network, for which complete observation of the root node, leaf nodes, degree of nodes, etc. is possible. Figure 1.7(b) shows a generic graph with many nodes and multiple connections between nodes, treated as a large-scale network; observation is not easy because finding the root node, leaf nodes and degree of nodes is difficult in this kind of large-scale