


ABSTRACT
Phishing is a network attack in which sensitive information, such as banking details, user names, passwords, and credit card details, is acquired illegally from users. Attackers combine website imitation and social engineering to trick a user into revealing private information. As technology improves, phishing scams also become more sophisticated. Phishing attacks mostly arrive as spoofed emails that appear legitimate and persuade users to click the links they contain. This paper describes how phishing attacks are carried out and how to avoid them. We propose a new end-host based anti-phishing algorithm, which we call the “Link Guard” algorithm. Link Guard can detect not only known but also unknown phishing attacks. We implemented Link Guard on Windows XP. Our tests verified that Link Guard is successful in detecting and preventing attacks.

GENERAL TERMS
Algorithms, Experimentation, Theory 
KEYWORDS
Biometrics, Neural Network, Feature Extraction, Normalized area of signature, center of mass

INTRODUCTION
Recently, phishing attacks have become a worldwide problem. News channels report many such cyber attacks, in which users are lured into entering credentials such as a username, password, or credit card details on a fake website that looks exactly like a legitimate bank website.
Phishing is one of the most effective ways for criminals to obtain the sensitive information they want. The term is a variant of “fishing”, probably influenced by “phreaking”. Common types of phishing include:

Phishing
Spear Phishing  
Clone Phishing 
Whaling Phishing  

 

The most common attack method is to send victims emails that claim to come from banks or online agencies. These emails give a made-up reason, for example that your credit card code has been mis-entered many times, or they offer other services, to draw you to a website where you are asked to confirm or modify your account number and password through the link provided in the email. Clicking the link takes you to a counterfeit website. The design, the functions, and even the URL of such a fake website can appear so real that it is hard to tell it apart from the real one. When a person enters the account number and password, the attacker collects the credentials at the server side and can use them for further actions, such as withdrawing money from the account.

APPROACHES TO PREVENT PHISHING
There are several ways to prevent phishing attacks: 1) educate users so that they understand how phishing attacks work, and show a warning when a phishing mail is received; 2) use legal methods to punish phishing attackers; 3) use technical methods to stop phishing attackers. This paper focuses on the third. If one or more of the steps needed by a phishing attack can be cut off, the attack is prevented. In what follows, we briefly review these technical approaches:

1) Detect and block the phishing Web sites in time: If we can detect phishing Web sites in time, we can block them and prevent phishing attacks. It is relatively easy to determine whether a given site is a phishing site, but it is hard to find such sites in time. Here we list two methods for phishing site detection. a. The Web master of a legitimate Web site periodically scans the root DNS for suspicious sites (e.g. www.1cbc.com.cn vs. www.icbc.com.cn).

b. Since the attacker must duplicate the content of the target site, he must use tools to (automatically) download the Web pages from the target site. It is then possible to detect this kind of download at the Web server and trace it back to the attacker. Both approaches have shortcomings. DNS scanning increases the overhead of the DNS systems and may cause problems for normal DNS queries; furthermore, many phishing attacks simply do not require a DNS name. As for download detection, clever attackers can easily write tools that imitate human behavior to defeat the detection.

2) Enhance the security of the Web sites: Business Web sites, such as those of banks, can adopt new methods to protect users’ personal information. One method is to use hardware devices. For example, Pnb bank provides a hand-held card reader to its users. Before shopping online, users insert their credit card into the card reader and enter their personal identification number (PIN); the card reader then generates a one-time security password, and customers can complete transactions only after entering the right password. Another method is to use biometric characteristics (e.g. voice, fingerprint, iris, etc.) for user authentication. For example, Paypal tried to replace single-password verification with voice recognition to enhance the security of its Web site. With these methods, the attackers cannot accomplish their tasks even after they have obtained part of the victim’s information.

3) Block the phishing e-mails by various spam filters: Attackers generally use e-mails to lure potential victims. SMTP (Simple Mail Transfer Protocol) is the protocol used to deliver e-mails on the Internet. It is a very simple protocol that lacks mandatory authentication mechanisms. Information about the sender, such as the sender’s name and e-mail address, the route of the message, etc., can be forged in SMTP. Thus, attackers can send out large numbers of spoofed e-mails that appear to come from legitimate organizations. Because the attackers hide their identities when sending spoofed e-mails, phishing attacks would decrease if anti-spam systems could tell whether an e-mail was really sent by its claimed sender. Techniques that prevent senders from counterfeiting their Sender ID (e.g. SIDF of Microsoft [6]) can therefore defeat phishing attacks efficiently.
4) Install online anti-phishing software in users’ computers: Despite all the above efforts, it is still possible that a user may visit a fake Web site. Users can install anti-phishing tools on their computers. The anti-phishing tools in use today can be divided into two categories: blacklist/whitelist based and rule-based.

IV. RELATED WORKS

4.1 CATEGORIES OF HYPERLINKS USED IN PHISHING ATTACKS:

In general, the structure of a hyperlink is as follows:
<a href="URI">anchor text</a>
where URI stands for uniform resource identifier. The URI gives the actual destination of the hyperlink, and the anchor text is the text that describes the link to the user. The user normally sees only the anchor text while the URI is hidden, and attackers take advantage of this. We call the URI in the hyperlink the actual link and the anchor text the visible link.
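
As a minimal sketch of the distinction between the actual link and the visible link, the following Python code collects both from the anchors of an HTML fragment using the standard library parser. The names LinkExtractor and extract_links are illustrative and not part of any published implementation, and the sample link reuses the example domains from Category1 below.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (actual link, visible link) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (a_link, v_link) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (a_link, v_link) pairs found in the HTML text."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


sample = '<p>Visit <a href="http://www.fakephish.com">www.onlinepnbda.com</a></p>'
print(extract_links(sample))
# [('http://www.fakephish.com', 'www.onlinepnbda.com')]
```

The user sees only the second element of each pair, while the browser follows the first.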
Hyperlinks used by attackers in phishing attacks fall into the following 5 categories:

Category1: The hyperlink provides a DNS domain name in the visible link, but the visible link does not match the actual link. For example, the hyperlink www.onlinepnbda.com seems to link to PNB online net banking, but it actually links to the phishing site www.fakephish.com.

Category2: A dotted-decimal IP address is used in place of a DNS domain name in the URI or anchor text. An example is a link displayed as “click here” whose URI contains an IP address instead of a domain name.

Category3: Attackers use encoded hyperlinks to trick users. a. The DNS name in the hyperlink is encoded into the corresponding ASCII codes. Consider the link www.onlinepnb.com. It seems to link to the online Pnb site, but it actually links to the phishing site http://74.134.75.81:54/l/index.htm. b. Attackers also use special characters (such as $ in the visible link) to make users believe that the e-mail was sent from an authorised site.

Category4: The hyperlink gives no destination information in its anchor text and uses a DNS name in its URI that resembles the name of a well-known company or organization. For instance, consider the link “Click here to confirm your account”. It seems to come from Pnb online, but its DNS name is actually registered by the attackers to make users believe that it has something to do with Pnb.

Category5: Attackers also exploit vulnerabilities of the target Web site to redirect users to their phishing sites. For instance, consider a link displayed as “Click here”: once clicked, it redirects the user to the phishing site 206.241.225.10 because of a vulnerability of passport.india.gov.in.
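
To illustrate Categories 2 and 3, the sketch below decodes an encoded link and checks whether its host is a dotted-decimal IP address. It assumes that the encoding is URL percent-encoding, which the text does not state explicitly, and the helper names host_of and is_dotted_ip are hypothetical. The encoded string is simply the phishing URL from the Category3 example with its host percent-encoded.

```python
import ipaddress
from urllib.parse import unquote, urlsplit


def host_of(link):
    """Return the lower-cased host part of a link ('' if there is none)."""
    if "://" not in link:
        link = "http://" + link  # let urlsplit see a network location
    return urlsplit(link).hostname or ""


def is_dotted_ip(host):
    """True if host is a dotted-decimal IPv4 address rather than a DNS name."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


# Assumed encoding: each character of the host written as %XX.
encoded = "http://%37%34%2E%31%33%34%2E%37%35%2E%38%31:54/l/index.htm"
decoded = unquote(encoded)

print(decoded)                         # http://74.134.75.81:54/l/index.htm
print(is_dotted_ip(host_of(encoded)))  # False: the IP address is hidden
print(is_dotted_ip(host_of(decoded)))  # True: visible after decoding
```

The IP address becomes detectable only after decoding, which is why the links have to be decoded before they are analyzed.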

4.2 THE LINKGUARD ALGORITHM

LinkGuard works by analyzing the differences between the visible link and the actual link. It also calculates the similarity of a URI to a known trusted site. The algorithm is described below. The following terms are used in the algorithm:
v_link: visual link;
a_link: actual link;
v_dns: visual DNS name;
a_dns: actual DNS name;
sender_dns: sender’s DNS name;

4.3 THE WORKING OF LINKGUARD:
The LinkGuard algorithm works as follows.
Step1: In its main routine LinkGuard, it first extracts the DNS names from the actual and the visual links. It then compares the actual DNS and the visual DNS; if the names are not the same, it is a phishing attack of category 1.
Step2: It then checks for dotted-decimal addresses. If a dotted-decimal IP address is used directly in the actual DNS, a category 2 phishing attack is possible.
Step3: If the actual link or the visual link is encoded (categories 3 and 4), LinkGuard first decodes the links and then recursively calls itself to return a result.
Step4: If the visual link has no destination information (category 5), LinkGuard calls AnalyzeDNS to analyze the actual DNS. LinkGuard therefore handles all 5 categories of phishing attacks.
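
The four steps above can be sketched in Python as follows. This is a simplified illustration and not the original implementation: the function names, the return values, the use of percent-decoding for Step3, and the final fall-through result are all assumptions, and AnalyzeDNS is passed in as a callable because its details are outside these steps.

```python
import ipaddress
import re
from urllib.parse import unquote, urlsplit

PHISHING = "phishing"
POSSIBLE_PHISHING = "possible phishing"
NOT_PHISHING = "not phishing"

# A host is a sequence of dot-separated labels (also matches dotted IPs).
_HOST_RE = re.compile(r"[a-z0-9-]+(\.[a-z0-9-]+)+")


def get_dns_name(link):
    """Return the host named by a link, or None if it names no destination."""
    candidate = link.strip()
    if "://" not in candidate:
        candidate = "http://" + candidate
    try:
        host = urlsplit(candidate).hostname or ""
    except ValueError:
        return None
    return host if _HOST_RE.fullmatch(host) else None


def is_dotted_ip(host):
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False


def link_guard(v_link, a_link, analyze_dns):
    v_dns = get_dns_name(v_link)
    a_dns = get_dns_name(a_link)
    # Step1: visual and actual DNS names both present but different.
    if v_dns and a_dns and v_dns != a_dns:
        return PHISHING
    # Step2: dotted-decimal IP address used directly as the actual DNS.
    if a_dns and is_dotted_ip(a_dns):
        return POSSIBLE_PHISHING
    # Step3: decode encoded links and analyze the decoded form.
    v_dec, a_dec = unquote(v_link), unquote(a_link)
    if (v_dec, a_dec) != (v_link, a_link):
        return link_guard(v_dec, a_dec, analyze_dns)
    # Step4: no destination in the visual link, so analyze the actual DNS.
    if v_dns is None and a_dns:
        return analyze_dns(a_dns)
    return NOT_PHISHING  # assumption: no evidence of phishing found


def stub_analyze(a_dns):
    """Placeholder for AnalyzeDNS; always reports a possible attack."""
    return POSSIBLE_PHISHING


print(link_guard("www.onlinepnbda.com", "http://www.fakephish.com", stub_analyze))
print(link_guard("click here", "http://74.134.75.81:54/l/index.htm", stub_analyze))
```

The recursion in Step3 stops as soon as decoding no longer changes either link, so a literal percent sign cannot cause an endless loop.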

4.4 DRAWBACKS OF LINKGUARD ALGORITHM:

LinkGuard will treat www.pantajaliindia.com.cn (phishing site) as a normal site if the user has never visited www.pantajaliindia.com.cn (e-commerce website), which results in a false negative.
www.iee.org and www.ieee.org are both legitimate sites, but LinkGuard will consider the first a phishing site because the two DNS names have a similarity index of 3/4, resulting in a false positive.
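
The value 3/4 can be reproduced with a short calculation. The sketch below assumes that the similarity index is defined as (maximum length minus edit distance) divided by maximum length, and that it is applied to the distinguishing labels "iee" and "ieee" rather than to the full host names. Both points are assumptions, since the text does not define the index.

```python
from fractions import Fraction


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed similarity index: (max_len - edit_distance) / max_len."""
    longest = max(len(a), len(b))
    if longest == 0:
        return Fraction(1)
    return Fraction(longest - edit_distance(a, b), longest)


print(similarity("iee", "ieee"))  # 3/4
```

One inserted character separates the two labels, so a threshold tuned to catch look-alike names such as 1cbc and icbc also flags this legitimate pair.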

5 PROPOSED APPROACH:
In this section we outline the proposed approach and its algorithm, and explain how it improves on the existing approach.
5.1 The PWDAP (Phishing Website Detection and Prevention) algorithm first examines the difference between the visual link and the actual link. Then, to reduce false positives and false negatives, it calculates the weights of the terms on the page and compares the suspected phishing site with the legitimate site based on the highest-scoring terms. The method uses TF-IDF for detecting phishing sites. TF-IDF is a well-known information retrieval algorithm that can be used for comparing and classifying documents, as well as for retrieving documents from a large corpus. Below we describe how TF-IDF works.
 
5.2 TF-IDF ALGORITHM:
• TF-IDF is an algorithm often used in information retrieval and text mining. It produces a weight that measures how important a word is to a document in a collection. The importance increases with the number of times the word appears in the document, but is offset by the frequency of the word in the collection.
• The term frequency (TF) is simply the number of times a given term appears in a specific document. This count is usually normalized to prevent a bias towards longer documents (which may have a higher term frequency regardless of the actual importance of that term in the document), giving a measure of the importance of the term within that document. The inverse document frequency (IDF) is a measure of the general importance of the term: it decreases as the number of documents in the collection that contain the term increases.
• Thus, a term has a high TF-IDF weight when it has a high term frequency in a given document (i.e. the word is common in the document) and a low document frequency in the whole collection (i.e. it is relatively uncommon in other documents).
The flowchart for the PWDAP algorithm is drawn below:
The following terms are used in the algorithm:
v_link: visual link;
a_link: actual link;
v_dns: visual DNS name;
a_dns: actual DNS name;
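
The TF-IDF weighting described in this section can be sketched as follows. The code uses one common variant, TF = count / document length and IDF = log(N / number of documents containing the term), which is an assumption because the paper does not give exact formulas. The function names and the tiny corpus are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Return {term: weight} for one tokenized document against a corpus."""
    n_docs = len(corpus)
    counts = Counter(doc_tokens)
    weights = {}
    for term, count in counts.items():
        tf = count / len(doc_tokens)               # normalized term frequency
        df = sum(1 for doc in corpus if term in doc)
        idf = math.log(n_docs / df) if df else 0.0
        weights[term] = tf * idf
    return weights


def top_terms(weights, k=5):
    """Return the k terms with the highest weight (ties broken by name)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


corpus = [
    ["pnb", "online", "banking", "login", "pnb"],
    ["online", "shopping", "offers"],
    ["online", "news", "today"],
]
weights = tf_idf(corpus[0], corpus)
print(top_terms(weights, k=3))  # ['pnb', 'banking', 'login']
```

The word "online" appears in every document of this corpus, so its IDF is log(1) = 0 and it drops out of the top terms even though it occurs in the page.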

5.3 THE WORKING OF PWDAP:
The PWDAP algorithm works as follows:
Step1: In its main routine PWDAP, it first obtains the DNS names from the actual and the visual links. It then compares the actual and visual DNS names; if the names are not the same, it is a phishing attack of category 1.
Step2: It then checks for dotted-decimal addresses. If a dotted-decimal IP address is used in the actual DNS, it is a possible phishing attack of category 2.
Step3: If the actual link or the visual link is encoded (categories 3 and 4), PWDAP first decodes the links and then recursively calls itself to return a result.
Step4: When there is no destination information (DNS name or dotted IP address) in the visual link (category 5), PWDAP calls AnalyzeDNS to analyze the actual DNS. PWDAP therefore handles all 5 categories of phishing attacks.
Functions of Subroutines:
In the subroutine AnalyzeDNS, if the actual DNS name is contained in the blacklist, we can be sure that it is a phishing attack. Similarly, if the actual DNS name is contained in the whitelist, it is not a phishing attack. If the actual DNS name is contained in neither the whitelist nor the blacklist, then:
Step5: Calculate the TF-IDF score of each term on the web page.
Step6: Generate a set by taking the five terms with the highest TF-IDF weights.
Step7: Feed this set to a search engine, which in this case is Google.
Step8: If the domain name of the current web page matches the domain name of one of the top N search results, the page is considered a legitimate web site. Otherwise, it is considered a phishing site.
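
The list checks in AnalyzeDNS and Steps 7 and 8 can be sketched as below. The search engine is represented by a hypothetical callable, search, that takes a query string and returns a list of result URLs; no real search API is used. The function names, the default top_n, and the stripping of a leading "www." are all assumptions made for this illustration.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the host of a URL without a leading 'www.' (an assumption)."""
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def classify(page_url, key_terms, blacklist, whitelist, search, top_n=10):
    """Blacklist/whitelist check followed by the search-result check."""
    domain = domain_of(page_url)
    if domain in blacklist:
        return "phishing"
    if domain in whitelist:
        return "legitimate"
    # Step7: query the search engine with the highest-weighted terms.
    results = search(" ".join(key_terms))[:top_n]
    # Step8: legitimate only if the page's domain is among the top results.
    if any(domain_of(result) == domain for result in results):
        return "legitimate"
    return "phishing"


def fake_search(query):
    """Stand-in for a real search engine; returns canned results."""
    return ["http://www.icbc.com.cn/", "http://example.org/banking"]


terms = ["icbc", "bank", "login", "account", "card"]
print(classify("http://www.1cbc.com.cn/login", terms, set(), set(), fake_search))
print(classify("http://www.icbc.com.cn/", terms, set(), set(), fake_search))
```

With these canned results, the look-alike domain from Section 3 is classified as phishing because it is absent from the results returned for its own key terms, while the real domain is classified as legitimate.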

 
6 CONCLUSION:
This work proposed a new approach to detecting phishing websites that combines analysis of the characteristics of hyperlinks used in phishing attacks with content analysis using the TF-IDF algorithm, and thereby addresses the drawbacks of the existing LinkGuard algorithm. With the PWDAP algorithm, the problem of false positives and false negatives in LinkGuard is reduced and its accuracy is improved.