Sunday, December 15, 2013

Scientific Computing: Modeling Neuroplasticity

The pursuit of strong artificial intelligence involves numerous areas of research. While much current AI research focuses on reproducing specific intelligent tasks the brain is capable of, disciplines like computational neuroscience seek to understand how the brain works in general. In a recent study from the field, researchers at MIT modeled neuroplasticity: how the mind is able to learn new things while still retaining what it has already learned.

The research has shown that neurons are constantly trying out new configurations of how they connect to other neurons to allow for the brain to learn as many tasks as it needs to and find the best configuration. This allows neurons to specialize in certain tasks while others are still able to learn new tasks.

One key element of the study, not widely explored before, was determining how noise acts within the model. The researchers found that noise could actually benefit the model: when the model is hyperplastic, noise drives it to explore more new connection configurations. They concluded that rather than hindering the model, the noise helped it learn a variety of new things while retaining the ability to do old ones. The model also helps explain how skills diminish when not practiced often enough, since new connections will eventually start to overwrite old skills after too much time has elapsed.

Only time will tell if this research will lead to further research and findings on the subject or just remain an interesting fact. Regardless, any breakthrough such as this in computational neuroscience helps towards both our understanding of how the human brain works in general as well as the long term goal of trying to create an intelligence on par with it.

Computer Graphics: Breakthrough in Image Pattern Detection

In a research experiment involving unsupervised machine learning, Google may have discovered the most significant image pattern on the internet, or at least on Youtube where the experiment was performed. The system was designed to detect and rank imagery patterns, which researchers could then analyze for what they represent. It may come as no surprise, then, that this system succeeded in detecting what is important to the users of Youtube and most of the internet in general: cats.

What was originally designed to detect significant patterns in imagery data became the world's first cat detector: after training on Youtube videos over the course of three days, the system found imagery of cats to be among the most frequently detected and thus most significant. According to an article on Slate, linear tool-like objects held at about a 30-degree angle were also common features detected, and after adding a round of supervised learning, the classifier could detect human faces with around 82% accuracy.

While perhaps not the most useful breakthrough in unsupervised learning on the surface, the model does prove the capability of emerging methods to detect meaningful features in images and video. Perhaps with some tweaking the system will also be able to detect dogs, but no breakthroughs in this area have emerged yet.

Communications and Security: Knowing How Your Code Works

It may seem obvious when said, but knowing exactly how your code works and what it is doing, especially in edge cases, is a very important aspect of security. When an attacker discovers a bug or "feature" in your code that you are unaware of, it can often be exploited to varying degrees of maliciousness. The well-known music streaming service Spotify learned this lesson first-hand when an exploit involving the way it processed usernames was found.

The key mistake was allowing users to sign up with usernames containing any valid Unicode character while storing a more restricted version internally. The username that was actually stored and checked against for internal purposes was processed with Python's lower() string method, which, as it turns out, maps a large number of Unicode characters onto the 26 lowercase English ASCII letters. As a result, many different submitted usernames could become exactly the same username once processed by lower(). The attack itself used this fact to hijack user accounts.
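To illustrate the collision concretely, here is a small Python 3 sketch. The `canonicalize` function is a hypothetical stand-in for Spotify's actual username processing, which is described only as a lower()-based transformation; the usernames are invented.

```python
# Python's str.lower() maps some non-ASCII Unicode characters onto plain
# ASCII letters, so visually distinct usernames can collide after processing.

def canonicalize(username: str) -> str:
    """Toy stand-in for a lower()-based username canonicalization step."""
    return username.lower()

# U+212A KELVIN SIGN looks like "K" but is a distinct character,
# and its Unicode lowercase mapping is the ASCII letter "k".
original = "kyle"
spoofed = "\u212ayle"  # "Kyle" written with the Kelvin sign

assert original != spoofed  # different raw strings...
assert canonicalize(original) == canonicalize(spoofed)  # ...same stored name
print(canonicalize(spoofed))  # -> kyle
```

An attacker who registers the spoofed form ends up controlling the same canonical account name as the victim, which is the collision the password-reset attack relied on.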

Existing Spotify accounts could be hijacked by signing up for a new account with a username that maps to an existing username when processed by lower(), then submitting a password reset request. The new account creation process would associate the new email address as a change of email address to the old account and thus send the password reset request to that address instead of the one belonging to the original user. Once the password is reset one can log in as the original user with the new password.

While this is a more obscure example of unknown code behavior going wrong--which also arguably makes it a more interesting one--it does show the importance of verifying code behavior as well as user inputs for security purposes. Luckily for Spotify, the user who discovered this exploit reported it quickly, but the results could have been much worse under different circumstances.

Artificial Intelligence: Why You Should Have Paid Attention in Your Statistics Classes

While not a particularly new field in Computer Science, Artificial Intelligence has gone through many changes over the years, with many advances and setbacks. As it has become increasingly widespread and marked with new successes over the past decade or two, one thing has become clear: statistics and probabilistic approaches seem to be the key to achieving continued success in just about all facets of the field.

One area within AI that has seen tremendous success from statistical and probabilistic approaches is Computational Linguistics. This area has seen a longtime rivalry between approaches based on hand-crafted rules from linguists and statistical/probabilistic approaches. As Peter Norvig notes, the majority of systems successfully solving problems within Computational Linguistics use statistical and/or probabilistic approaches at least partially, if not entirely. These include such popular and widespread application areas as search engines, machine translation, and speech recognition. As new research in this area tends toward newer and better statistical and probabilistic approaches, this trend does not seem likely to change anytime soon.

Yet another area with traditionally more formal roots that has benefited greatly from probability and statistics is Graph Theory. Probabilistic graph-based models named for Thomas Bayes and Andrey Markov are very widespread throughout Artificial Intelligence these days, and the applications of such models may be limitless. They are used widely in Computational Linguistics, pattern recognition, and Bioinformatics, just to name a few areas. Such models are capable of encoding the fluctuations, randomness, and uncertainty that are a part of most things we try to model.
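As a toy illustration of the kind of model named for Markov, here is a two-state Markov chain with made-up transition probabilities; repeatedly applying the transition operator converges to the chain's stationary distribution, the long-run fraction of time spent in each state.

```python
# A two-state weather chain. transition[s] gives P(next state | current state s).
# The probabilities are invented purely for illustration.
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def step(dist):
    """Advance a probability distribution over states by one transition."""
    new = {s: 0.0 for s in transition}
    for s, p in dist.items():
        for t, q in transition[s].items():
            new[t] += p * q
    return new

# Starting from a certainly-rainy day, iterate until the distribution
# stabilizes; the fixed point is the stationary distribution (2/3, 1/3).
dist = {"sunny": 0.0, "rainy": 1.0}
for _ in range(100):
    dist = step(dist)

print(dist)  # -> roughly {'sunny': 0.667, 'rainy': 0.333}
```

The same encode-uncertainty-as-probabilities idea scales up to the Bayesian networks and hidden Markov models used in speech recognition and bioinformatics.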

Machine Learning is another popular area within Artificial Intelligence that makes heavy use of statistics. Many machine learning techniques are actually intended to solve classic problems from statistics, such as linear and logistic regression, using new approaches. A number of other parallels exist as well, showing the deep interrelation between Machine Learning and Statistics.
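As a concrete example of that overlap, here is the classic closed-form least-squares solution to simple linear regression, a statistics problem that machine learning toolkits also solve; the data set is invented for illustration.

```python
# Closed-form least-squares fit for a single feature: y ~ a*x + b.

def fit_line(xs, ys):
    """Return slope a and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]  # exactly y = 2x + 1, so the fit recovers it

a, b = fit_line(xs, ys)
print(a, b)  # -> 2.0 1.0
```

A machine learning approach like gradient descent would reach the same line iteratively; the statistical formulation and the learning algorithm are two routes to one answer.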

While math in general always seems to be a driving force behind the discovery of new techniques and algorithms in software, statistics and probability in particular are becoming increasingly prominent in computer science. It doesn't seem likely that computer science as a discipline will ever break away from its dependence on math, so to the CS majors out there: be sure to pay attention in math class!

Sunday, November 17, 2013

History of Computer Science: Artificial Intelligence

“Artificial Intelligence research is an attempt to discover and describe aspects of human intelligence that can be simulated by machines.”
Philip C. Jackson, Introduction to Artificial Intelligence

Artificial Intelligence is a popular industry within technology today, but its history has been a rocky one filled with many transformations and long lulls between rapid booms. The definition itself has even evolved over the years in response to various innovations and setbacks. While the goal of AI has remained fairly constant, our understanding of that goal and innovations towards it have set milestones in its history. Two main types of AI have emerged over the years based on both limitations and advances: strong and weak AI.

Originally, artificial intelligence was the purely theoretical concept of machines that could mimic human learning and thought. This is now considered strong AI, and while it ultimately remains the goal of the discipline, there is still a long way to go before it can be achieved. In the 1940s, early research involving models of neural networks structured like mammalian brains helped introduce the discipline and shape further research pursuits. It was the Turing Test in the 1950s, however, that truly marked the rise in popularity of research into strong AI. Alan Turing devised a test that has become the standard for what would be considered (strong) artificial intelligence. A person at a computer terminal, called the interrogator, chats with two different agents, one a human and one the proposed AI system. If the interrogator cannot reliably distinguish which agent is the human, the proposed system is considered to exhibit artificial intelligence. This is more or less still considered an applicable test for strong AI, and other tests have even been derived from it as our understanding and definition of AI have evolved.

Over the next couple of decades some advancements were made in the field, particularly with problem solving applications and testing environments. Computer vision, robotics, and natural language processing in particular had advances in testing environments. By the 1970s, however, the rate of advancement in artificial intelligence simply wasn't keeping up with the demanding expectations held for it at the time. Despite the advances that were made, the practical applications were very rare and interest started to decline.

This lull lasted about a decade, but research continued, and by the 1980s sales of AI hardware and software began to rise as the much-needed practical applications finally started to appear. These applications led to the concept of weak AI: systems that were not intelligent in the strong AI sense but used techniques considered vital to such a system, exhibiting intelligent qualities while solving a variety of interesting problems on their own. Speech recognition and learning systems were two of the first areas to see an increase in such practical applications, followed by automatic programming systems that could generate code from desired behavior and many of the statistical and graph-based approaches that are widely popular today.

Through the 1990s and into the new millennium many new applications and areas emerged, including machine learning, planning systems, and search systems. The use of weak AI techniques and systems has continued to increase to this day and is now very widespread, with major companies such as Google, IBM, and Microsoft leading the way in new research and applications. With this continuing trend, the possibility of strong AI once again emerges as a goal within the field. The hopes and expectations have even risen, as with the concept of The Singularity, attributed to Vernor Vinge, in which the intelligence of computers not only meets but exceeds that of humans. It is difficult to say if or when a truly intelligent system will be created, especially as the understanding of what properties such a system should have evolves along with the techniques that might lead to it, but the field of artificial intelligence is certainly an interesting one that is not likely due for another decline in the foreseeable future.

“AI has the potential to change our world like no other technology.”
M. Tim Jones, AI Application Programming

File Sharing: Changing How Business is Done

What if I told you that you could get all the music, movies, books, and software you want for free? Well, the good news is that nowadays you can quite easily. The bad news is that it's illegal. File sharing is nothing new as far as the Internet is concerned, but over the years the attempts to mitigate it have paled in comparison to its proliferation. Although the term file sharing refers generally to the distribution of any files, its usage today almost always refers to pirated content, or at the very least to the services that primarily serve such content.

In the early days of content piracy file sharing websites would host files for download. Music was by far the most common type of content pirated then, taking off with the introduction of Napster. Software piracy was also quite widespread through various warez sites, but was not as widespread as music. The problem with file sharing sites, however, was that the content was all centralized in one or a few locations so taking down the sites could easily mitigate the piracy taking place. This ultimately led to the demise of Napster, but not before a new and significantly more resilient framework had already taken hold. Why host the files and tax your bandwidth when you could just as easily pass this on to the users? With this realization, peer-to-peer file sharing was born.

Unlike hosted content, peer-to-peer file sharing serves content to those who want it from those who already have it. Users simply download a client that handles this for them and are free to download anything available from other users. Early on, clients like Kazaa and Limewire were quite popular, but they ultimately faded as newer clients were developed. The critical turning point, though, was torrents. While the earlier clients relied on their own servers to coordinate sharing and search functionality--once again a point of weakness in the system--torrents were a more general framework built on files containing metadata about the content to download and the servers coordinating the transfer. This spread the server load across multiple networks and countries, making takedowns more difficult since they would have to be carried out for each network individually. The torrent files themselves were available from multiple sites, gaining the same benefit. Some sites required signing up and often imposed restrictions on doing so to prevent those looking to take down the networks from being able to do so as easily. Even multiple clients using the same framework eventually became available. This level of dispersal made taking down the framework next to impossible for those seeking to protect the content being pirated.

The various industries affected by file sharing continue to evolve as a result. Some of the most popular sites for downloading torrents, such as The Pirate Bay, have been constantly targeted with little to no effect as far as the users are concerned. Many industries have looked into or implemented increased security measures to try to combat piracy. Software vendors have implemented activation requirements that track the number of copies of a particular serial number in use at a particular time, but this has done little to help, since those pirating software tend to be more computer savvy, and cracks that patch the software to remove this security functionality are quite common. Electronic book publishers attempted to add encryption to control the number of copies and their redistribution. This too has been mostly in vain, as all such encryption schemes to date have been broken and the cost of developing them far outweighs the cost of publishing the book in the first place.

The music industry has by far been affected most by piracy and has gone through the most change as a result. This has a lot to do with it being the most popular type of content to pirate, but also through many unrelated changes and flaws in the business model itself. Music piracy is nothing new, dating back to the days of people copying cassette tapes. The industry itself has for a long time seemed inherently flawed, however, which in part aided the piracy movement. With the introduction of CDs, the music industry underwent a major shift in its pricing practices. Costing next to nothing to make, the industry marked up the price to exorbitant amounts expecting that people would just pay for the higher quality, and for a time they did.

The evolution of the Internet and the introduction of file sharing worked to shatter this model, however. After years of price-gouging, a large number of people felt little remorse pirating music, and the knowledge of just how little of the price actually went to the artists, the people listeners actually did care about, did little to counteract this trend. One could easily support the artist by going to shows and buying merchandise without lining the pockets of the record industry. All the while this trend escalated, the price of producing music dropped significantly through advances in technology, making the overpricing practices even more of an insult to music listeners.

These factors played a key role in reshaping the music industry and its business model. Realizing that people would no longer pay for overpriced CDs and that the popularity of mp3s was starting to exceed that of physical media, the music industry started selling digital copies of music at more reasonable prices and even on a song-by-song basis. How many CDs have you bought where you only liked one or two of the songs anyway? The drop in the price of recording has also been a hit to the record labels while at the same time benefiting music as a whole significantly. Small bands can now record their own studio-quality albums instead of relying on labels. They can even distribute albums themselves through the same digital media outlets as the labels, such as iTunes and Amazon, or press their own CDs, since this cost has always been very low anyway.

The industries affected by piracy have had to adapt to survive, but it could easily be argued that any setbacks arose from flaws in the traditional business models to begin with, or from the inability to adapt quickly enough and in a reasonable manner to a fast-changing world. The distribution of legal content has likewise been affected, in this case very positively, as file sharing frameworks such as torrents provide a great framework for the mass distribution of content regardless of its legality. Open source software and content can easily be distributed through peer-to-peer file sharing without burdening their already struggling producers with the costs of hosting the content themselves. Regardless of one's standpoint on piracy, file sharing has certainly had a huge impact on many industries and how content is acquired. There is no question that it is here to stay and that embracing and adapting to it is by far the best way to react to this change.

Friday, November 15, 2013

Data Structures: Making Computing Possible

Data is at the heart of modern computing. The majority of what a computer scientist will learn, use, and possibly even research comes down to two things: processing data and storing data. The former is achieved through algorithms and the latter through data structures. Algorithms define the means to achieve particular tasks, generally involving the processing of data, and are quite varied in what they do based on any number of possible problems that need to be solved. Data structures on the other hand, while still just as varied as algorithms, are singular in purpose. All data structures have the common goal of storing and allowing for the access of data, but their variety comes from the types of data and the pursuit of the most efficient ways of storing and accessing that data based on its type and the design of the system.

While there are countless algorithms out there--in fact, any explicit process for achieving a particular task is considered an algorithm--there are only a handful of fundamental data structures, and the discovery of a revolutionary new data structure is no small feat or event. They generally involve trade-offs in efficiency based on their particular use as well, making knowing them even more important for programmers. The three main types of data structures are linear data structures, trees, and maps.

Linear data structures are, as the name implies, a stringing together of points along a line. Any piece of data has at most two neighbors, one before and one after. The most common types of linear structures are arrays and linked lists; in concept both are structured the same, and the efficiency of various operations is the key difference. Arrays, for example, allow a particular data value to be looked up very efficiently, while linked lists allow a new value to be inserted between two existing values very efficiently. Each is slow at the operation the other is fast at. Linear structures can also be nested to create multidimensional structures such as matrices, but ultimately the linear structure remains intact for each component.
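The trade-off can be sketched in Python, using the built-in list as the array and a minimal hand-rolled linked list; the node class and values here are illustrative only.

```python
# Array vs. linked list: each structure is fast at what the other is slow at.

class Node:
    """One cell of a singly linked list."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

# Array-style lookup: jump straight to an index in constant time.
arr = [10, 20, 30, 40]
assert arr[2] == 30  # O(1); inserting in the middle would shift elements, O(n)

# Linked-list-style insertion: splice a node in without shifting anything.
head = Node(10, Node(20, Node(40)))
node20 = head.next
node20.next = Node(30, node20.next)  # O(1) once the neighbor is known
# Lookup by position, by contrast, means walking the chain: O(n).

values = []
n = head
while n:
    values.append(n.value)
    n = n.next
print(values)  # -> [10, 20, 30, 40]
```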

Trees get their name from the way they branch out from a common point, much like tree branches. They generally involve some method for subdividing each branch to allow for much more efficient search operations. This, however, requires that the data be comparable in a predictable way so the structure can determine which branch to take; if such a comparison can't be made with the same result every time, the structure will not work. Trees generally have the advantage of being efficient in both insertion and retrieval, which, if you recall, was a one-or-the-other choice with linear structures. While neither operation is as fast as its superior linear counterpart, the efficiency gained across both far exceeds the trade-offs of the linear structures.
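A minimal binary search tree sketches the idea; the class below is a simplified illustration (no balancing, no deletion), not a production structure.

```python
# At each node, smaller keys go left and larger keys go right, so both
# insertion and search discard roughly half the remaining data per step.

class BST:
    def __init__(self):
        self.key = None
        self.left = self.right = None

    def insert(self, key):
        if self.key is None:
            self.key = key
        elif key < self.key:
            self.left = self.left or BST()
            self.left.insert(key)
        elif key > self.key:
            self.right = self.right or BST()
            self.right.insert(key)
        # equal keys are ignored: the comparison must be predictable

    def contains(self, key):
        if self.key is None:
            return False
        if key == self.key:
            return True
        child = self.left if key < self.key else self.right
        return child.contains(key) if child else False

tree = BST()
for k in [8, 3, 10, 1, 6]:
    tree.insert(k)

print(tree.contains(6), tree.contains(7))  # -> True False
```

Note how every decision hinges on `<` giving the same answer every time, which is exactly the "predictable comparison" requirement described above.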

Maps tend to stand alone among data structures in that they fit a format of data that trees and linear structures can't handle: data values looked up by associated identifiers known as keys. A key is a small, unique value with data, generally much larger in size, associated to it in such a way that when a particular key is searched for, the associated data value is returned. For example, one could use people's names (provided they are unique) as keys to retrieve more extensive data on each person. Maps can be stored in trees where the ordering is determined by the key, or as hash tables, most easily thought of as a sparsely populated array containing the keys so that a particular key can be looked up and its associated data returned more efficiently than with the tree counterpart.
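Python's built-in dict is a hash map, so the name-to-record example above can be sketched directly; the people and fields are invented for illustration.

```python
# A small unique key (a name) retrieves a larger associated record.
people = {
    "Ada Lovelace": {"born": 1815, "field": "mathematics"},
    "Alan Turing": {"born": 1912, "field": "computer science"},
}

# Hash-based lookup: O(1) on average, no tree traversal needed.
record = people["Alan Turing"]
print(record["born"])  # -> 1912

# Keys must be unique: assigning to an existing key replaces its value.
people["Alan Turing"] = {"born": 1912, "field": "cryptanalysis"}
assert people["Alan Turing"]["field"] == "cryptanalysis"
```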

The specifics of data structures can be quite complex, and what is provided here is only a high-level overview of some of the most common types and features, but they are nonetheless a vital aspect of computer science and worth learning about for anyone interested in programming. Many books and websites exist on data structures as well as algorithms for more information and greater detail.

Hacking v2.0

The concept of hackers and hacking has been around for much of the time the Internet has, and much like the Internet itself these terms have evolved over time. Originally hacking referred to gaining unauthorized access to computers and systems, which is a definition that's still relevant today. The drive and spirit behind hacking has broadened the definition as new technologies have emerged and evolved, expanding it to encapsulate the driving force rather than the act itself.

Hacking, especially in its early days, was motivated by one simple driving force: curiosity. The original hackers weren't out to cause trouble or make a profit, but rather to explore systems fully and push the limits of computing itself. Hacking into computer systems was about exploring them and the techniques that provided unauthorized access to them. It is this motivation that is behind the broader definition of hacking today and its widespread usage outside of what was traditionally considered “hacking.”

The expansion of the term has led not only to more widespread usage, but to new trends and terminology as well. Long programming competitions, usually 24 or more hours, are gaining popularity and are referred to as "hackathons." These aptly named competitions embody the spirit of exploration and problem solving by encouraging contestants to modify, or "hack," existing APIs and frameworks to create new products and/or solve challenges. Likewise, modifying hardware to expand or alter its capabilities is also called hacking. You can even "hack" how you live with sites like Lifehacker and Hack-a-Day that encourage innovative solutions to everyday problems both in and out of the technology realm.

The newer usages and terms surrounding hacking are not all that has become widespread, however. The more traditional concept of hacking has also risen in popularity. As any student of strategy can tell you, the best way to combat an enemy is to learn to think like them to adapt to their attacks. It is in this that the traditional sense of hacking has taken off. The security world has become a huge industry that not only employs but also trains hackers to explore and find solutions to the multitude of threats that exist.

Computer and network security experts known as “white hats” learn the very hacking skills they must defend against in a controlled and legal manner via test systems and ultimately by being contracted for penetration testing (the authorized hacking of a system or systems). There is a multitude of courses, certifications, and even competitions that help security professionals learn and hone their skills.

The other side has likewise become a widespread and lucrative business, however. Hackers referred to as "black hats" illegally attack systems—often the very ones that the white hats are trying to defend. Credit card and banking information as well as scams are common revenue streams for black hats, and lucrative ones at that. One such group was able to make over a billion dollars from selling stolen credit card numbers. As the battle rages on, one thing is certain: hacking in every sense of the term is here to stay and will only become more prevalent as more and more of our lives move to the digital realm.

Monday, October 14, 2013

OPEN SOURCE: Challenging the Traditional Business Model

I'm sure you've heard the expression "you get what you pay for," but in the tech world this is not always the case. Many products, such as Apple computers, are significantly overpriced for what they are while on the flip side there is a huge community of software developers putting out free software that is as good or almost as good as paid alternatives.

Open source software is a major category of free software and a growing trend within the development community. Not all free software is open source, but a large amount of it is. Open source means the software's source code is available for download along with the software itself. This enables both collaboration on a massive scale, where developers work together to update and maintain the software, and reuse of the code as a basis for similar software or as a component within larger applications.

Open source software is not only good for getting decent software for free, but also a great way to contribute as a developer and get one's name out in the development community. Open source projects can be great items to add to one's resume as well and is seen by many as a way to give back to the software community. There are many that swear by it and would prefer that all software be open source.

While open source software is a great resource, it runs up against a basic economic reality, and as such it could never become the sole archetype of software distribution. The harsh truth is that the world operates on money, and in the end people need to make money to survive. This is why, despite the huge surge of open source projects on the internet, software companies remain largely profitable. There are a lot of useful things that are open source, but in the end the best things are not free. In all likelihood, the majority of contributors to open source projects do so in their spare time while working jobs at for-profit companies. This by no means prevents open source from growing further; it just prevents it from becoming universal within the development community. Unless, of course, developers plan on working at fast food restaurants during the day and coding open source projects all night.

Wednesday, October 2, 2013

AGILE: Sprinting Your Way to Productivity

It seems that the tech world is inundated with all sorts of buzzwords these days, but many are truly revolutionary and useful ideas. One of the most well-known of these is of course the "cloud" and "cloud computing," but while many companies leverage this terminology just to seem like they have the hot new commodity, cloud computing as a whole has revolutionized how we do things. A lesser-known but equally revolutionary framework is Agile. It has challenged decades-old methodologies for software development, namely what's referred to as "waterfall," and come out on top within many companies.

Agile at its core is just that: agile. It emphasizes breaking projects into smaller pieces that can be easily managed and completed in short intervals known as sprints, which typically tend to be one or two weeks long. Each sprint involves completing small components of a much larger project, including testing, which is arguably the biggest weakness in waterfall, as testing can sometimes take longer than the development itself. Agile encourages creating the tests before any development has actually begun, which allows the product to be tailored to the tests rather than the other way around. The tests themselves are created directly from user stories, another key component of Agile.

Agile emphasizes constant communication between all parties involved in a project. This includes management, developers, the customer, and often even end-users. User stories are created that detail exactly what each component should do and why. From these the development team decides on manageable pieces to work on and, as mentioned before, tests that will ensure these requirements are met. While waterfall obviously also must integrate input from the customer, that input is generally gathered at the beginning, a set of specifications is created from it, and the customer has little further input until the final product is finished, hopefully ending up as something like what was desired.
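The user-story-to-test flow described here can be sketched in a few lines; the story, function name, and numbers below are all invented for illustration.

```python
# User story (invented): "As a shopper, I want orders over $100 to get a
# 10% discount so that bulk purchases are rewarded."

# Step 1: write the test from the user story, before any implementation.
def test_order_total():
    assert order_total([40, 50]) == 90    # under the threshold: no discount
    assert order_total([60, 60]) == 108   # 120 > 100, so 10% off

# Step 2: write just enough code to make the test pass.
def order_total(prices):
    total = sum(prices)
    return total * 0.9 if total > 100 else total

test_order_total()
print("test passed")
```

Because the test encodes the story's acceptance criteria directly, the implementation is tailored to the requirements rather than the requirements being reverse-engineered from whatever got built.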

Since Agile uses short sprints to break the project into smaller components, the customer is constantly giving input and evaluating the work as it progresses. This alleviates the gap between customer wants and the realities of development. Constant input from the customer helps developers understand exactly what is needed and address things that are just not feasible in a timely manner, allowing the product development to be modified before it becomes too difficult to do so. Should a component not meet the customer's expectations or needs, the work lost to redo or modify it is minimized to at most a couple of weeks (the duration of a sprint). As the customer is constantly involved in the project, it also becomes easier to foresee uncertainties about the customer's needs and simply clarify before work even begins.

Agile has been adapted to many different types of projects even outside of the development community. It is a very useful framework and while not necessarily always the answer, it is very smooth and efficient when it is viable for a project. At the very least it is a powerful tool one can add to their list of skills by finding out more and trying it out. For more information, check out a website about Agile.

Friday, September 20, 2013

LinkedIn and Branding: Making Yourself Marketable to Employers

Social networking has taken over the internet and people's lives alike. While sites like Facebook and Twitter allow friends to stay in touch and see every single thing each other does to an almost stalker-like level, the role of social networking for business and anyone with a brand to promote has also been steadily increasing. It should come as no surprise, then, that LinkedIn, a social networking site designed for professional networking, has also grown in its role in the business world.

Professional networking exposes one to job opportunities, training resources, and events that enable further professional networking and that would be hard to find out about otherwise. LinkedIn provides the perfect channel for these opportunities by connecting people on the most powerful medium for finding and sharing information about them--the Internet. As such, building one's professional network on LinkedIn is critical for getting ahead in the business community. This is easier said than done, however, since how one goes about it and brands oneself can play an important role in finding and leveraging these opportunities.

Making sure your profile is complete and accurate is just the first step of properly branding yourself. This is what potential employers or referrers will see first. Obviously it is important to keep things professional, including your profile picture, to send the right impression. It's also important to provide details of as much relevant experience and as many qualifications as possible, since this is what those with job openings will be looking for. Given the increase in automation, using proper terminology for these qualifications also matters: chances are the initial survey of your profile will be done by a program looking for keywords rather than a person capable of determining what you meant to say. Largely due to this automation, some companies have even replaced traditional resumes with LinkedIn profiles as a way of screening potential hires.

Your profile, regardless of how good you make it, is still only useful if the right people actually see it. It's important to make the right connections on LinkedIn both on- and offline. Make sure you provide your LinkedIn information to anyone who would be a valuable professional networking contact, especially potential employers. Since you are constantly evolving professionally, it's important to make these connections even if you aren't what an employer is looking for at the moment; one day your qualifications and their requirements may align, and without the connection you'll have missed the opportunity. Even if that never happens, they may know of other job prospects suited to your skills and refer you. Job fairs and seminars are a great place to make these connections, so go to as many as possible and be sure to exchange information.

Even within LinkedIn and the Internet as a whole there are opportunities for networking. Searching for and connecting with people who already work for companies and in fields you would be interested in is a good way to make valuable contacts, along with joining groups related to both of these factors as well.[1] Likewise, groups and forums outside of LinkedIn that are of interest to you and related to your field can also be a great place to find professional contacts to network with on LinkedIn. People you already know are important to connect with since even if they aren't in a related field they may have connections that are. They also might not be fully aware of what job interests and qualifications you actually have, which your well-designed profile will tell them.

Everyone wants to find a satisfying and meaningful career that keeps them constantly interested and growing within it. This is easier said than done, but utilizing powerful tools such as LinkedIn to maximize your exposure and connections is a key element to achieving this goal. Even if you don't know what this goal actually is yet, connecting with people who do things that interest you can help you determine what you want to do based on the experiences they relate to you. So go out there and start using your LinkedIn account as much as your Facebook account, although not in the same manner for obvious reasons, and your path to a rewarding career will be underway.


[1] How to Use LinkedIn for Business

Friday, September 13, 2013

QR codes: Security Vulnerabilities Have Never Been Quicker

Although originally designed for use in manufacturing, where larger amounts of data than can be encoded in a traditional one-dimensional barcode were needed, the versatility and storage capacity of QR Codes have greatly expanded their capability and use. The most common use is undoubtedly to link users to websites via phones capable of reading these codes, in particular smartphones. While any number of other uses exist, from loading contact information to providing access to wireless networks[1], the ability to launch hyperlinks is undoubtedly the most widespread, at least as far as the consumer market is concerned. This, however, can also be problematic, as it carries all the same security concerns as clicking on a hyperlink from the web or, perhaps a better analogy as far as security is concerned, email.

Since many instances of QR Codes are found in public areas, they become vulnerable to attack via modification. Although several fairly complicated methods exist for modifying existing QR Codes to do things other than what was intended[2], simply creating and attaching a new code seems the most likely and, despite the article's standpoint, easiest attack vector. Nonetheless, several interesting attacks are outlined, including common URL-based attacks such as SQL injection and command injection, as well as phone-specific attacks such as using a buffer overflow to compromise the phone or scanning device directly.[2]
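As a sketch of the SQL-injection risk mentioned above, suppose a QR-encoded URL carries a query parameter that a backend pastes into a database lookup. The table, column, and coupon values below are invented for illustration, using Python's built-in sqlite3:

```python
import sqlite3

# Toy database standing in for the backend behind a QR-code promotion.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coupons (code TEXT, discount INTEGER)")
conn.execute("INSERT INTO coupons VALUES ('SAVE10', 10)")

malicious = "x' OR '1'='1"  # value smuggled in via the QR-encoded URL

# Vulnerable: the parameter is pasted straight into the SQL string,
# so the injected OR clause matches every row in the table.
vulnerable = conn.execute(
    "SELECT * FROM coupons WHERE code = '%s'" % malicious).fetchall()

# Safe: a parameterized query treats the entire value as data,
# so the injection attempt matches nothing.
safe = conn.execute(
    "SELECT * FROM coupons WHERE code = ?", (malicious,)).fetchall()

print(len(vulnerable))  # 1 -- injection matched the only row
print(len(safe))        # 0 -- injection neutralized
```

The scanned code never has to "hack" the phone here; it only has to deliver a URL to a backend that builds queries by string concatenation, which is why input validation matters regardless of how the URL arrived.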

Since QR Codes from a security standpoint can best be likened to links found within email, many of the same threats exist. The two most common types of threats that exist within email that could easily be extended to QR Codes are drive-by downloads and phishing scams. Drive-by downloads simply involve sending the user to a site that automatically downloads and installs software, generally malware, to the user's computer or device. These are quite common in email and as mobile device malware continues to become more prevalent QR Codes seem a good candidate for delivery along with traditional email.

Phishing, on the other hand, involves trying to get a user to enter and submit personal information, usually through a website designed to mimic a legitimate one. This is often limited simply to email addresses, which will undoubtedly be added to spam lists, but can extend as far as online login credentials for banks or even account numbers or social security numbers. A perfect example of how a QR Code might be used in phishing would be a fake QR Code on bank material that directs the user to a clone of the bank's site and captures their login credentials. The sophistication of phishing sites varies: some simply submit whatever form data you enter as-is, while others act as a man-in-the-middle, verifying with the actual bank site that a username exists and even circumventing security measures, for example fetching the Site Key from Bank of America and displaying it on the fake site. Others still progressively escalate the sensitivity of the data they try to steal, starting with just login credentials but then asking you to fill out more sensitive data to "verify your account" on subsequent pages.

Some basic security precautions recommended when using QR Codes include verifying what URL the code is trying to take you to before actually allowing your device to follow it, including checking the destination URL in the case of shorteners.[1] This of course is easier said than done, as most users of QR Codes are after the efficiency they provide and are unlikely to check these things, especially in the case of shortened URLs. Even in traditional browsers, where the URL for a link is generally shown somewhere within the chrome of the browser, users often don't bother looking, as is evidenced by the success of malicious sites in general. Even when checking, knowing how to spot a risky URL is a bit of an art that few people possess. Another tip, requiring diligence from both those generating QR Codes and those using them, is for the former to encode HTTPS URLs and for the latter to look for https in the URL.[3] SSL is becoming increasingly widespread as security awareness and concerns grow, breaking out of its traditional use for just logging into an account, but it is still not as recognized or widespread as is probably necessary yet. Security in general requires diligence on the part of both those creating and maintaining systems and those using them, so the human factor is always the key limitation in security. Creating awareness, however, is the best way to increase security for a better and safer computing experience.
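A first-pass check along these lines can be sketched as follows. The shortener list is illustrative and far from complete, and passing these checks certainly does not make a URL safe; they only catch the two red flags discussed above:

```python
from urllib.parse import urlparse

# Illustrative sample of common URL shorteners; real lists are much longer.
SHORTENERS = {"bit.ly", "goo.gl", "tinyurl.com", "t.co", "ow.ly"}

def qr_url_warnings(url):
    """Return a list of reasons a decoded QR-code URL deserves scrutiny."""
    warnings = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        warnings.append("not served over HTTPS")
    if parsed.hostname in SHORTENERS:
        warnings.append("shortened URL hides the real destination")
    return warnings

print(qr_url_warnings("http://bit.ly/3abc"))
# ['not served over HTTPS', 'shortened URL hides the real destination']
print(qr_url_warnings("https://www.bankofamerica.com/"))
# []
```

A real scanner app would go further, for example resolving the shortener to its final destination before showing it to the user, but even this much is more than most users check by hand.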


[1] Narayanan, A. Sankara. "QR Codes and Security Solutions." International Journal of Computer Science and Telecommunications 3.7 (2012): 69-72. Print.

[2] Kieseberg, Peter et al. "QR Code Security." SBA Research, unknown year. Print.

[3] Cole, Eric. "URL Shorteners / QR Codes." OUCH! June 2013: 1-3. Print.

Friday, September 6, 2013

Social Networking and Security...or the Lack Thereof When Promoting Your Brand

The world of social networking is a new and evolving one, bringing about heated discussions over previously overlooked topics such as privacy and security. As long as there is an internet to find it on, it's likely that everything you ever post will stick around and be findable by someone. Don't believe me? Just ask my friend who can't use his brand name anymore after googling it and finding a questionable AIM conversation we had that I posted to a website over ten years ago. This illustrates the first concern social networking and the internet as a whole pose to the security of your brand name: permanence. If you, an employee, or anyone else posts something that casts your brand in a bad light, it will be out there and can potentially hurt your brand image. This can be inadvertent or possibly even targeted, since your brand isn't the only one out there in whatever market you are in.

There are a vast number of tools out there to help you manage your brand and track how it is received, but those tools are also available to your competition and could potentially be used against you. Let's suppose your brand is Coca-Cola and the evil Pepsi is your competitor. While you use tools to track what people are saying about your wonderful product on social media sites, Pepsi can do the exact same thing for not only their brand, but also for yours. This could allow them to find out what your customers like, dislike, and want out of your brand, which in turn can allow Pepsi to target their advertising or even product to what your customer base wants and steal your customers. Although this is a more indirect threat to your brand security, it still can hurt sales and thus weaken your brand.

Suppose now that the evil brand Pepsi wants to be a little more proactive in attacking your brand. Knowing that you likely follow social networking to find out how best to improve your brand, they could start a campaign to bloat your data with false information. It's not hard to sign up for a social networking account and start posting whatever you want about a product or service, so multiply this by a few thousand and you're getting false feedback on how best to improve your brand. Social networking sites also don't do a particularly good job of verifying you are who you say you are, so attackers could even pose as employees of your fine company and start posting messages that hurt your brand's image. While it would technically be possible to prove in the long run that the account did not actually belong to an employee, by that point the damage to your brand would already be done.

The internet has been likened to the Wild West and while this is a large part of the appeal it has, it also means that there are many privacy and security concerns you must be aware of when making a presence on it, both personally and professionally. Your brand needs a presence on the internet and especially within social networking, but protecting and managing it within that context is critical. If you still don't think that the internet can potentially pose a huge threat to your business, go ahead and read this: http://www.kiplinger.com/article/investing/T048-C000-S002-the-truth-behind-penny-stock-spam.html.

Thursday, August 29, 2013

Welcome

I am currently an undergraduate student at San Jose State University majoring in Computer Science and working on a certificate in Computational Linguistics. I work at Barracuda Networks as a Software Engineering Intern where I mostly do programming in Java. In addition to Java, I have a lot of experience in C++, Python, HTML, CSS, and Javascript. Other languages I have used include Ruby and PHP, and I am quick to learn new programming languages. I currently have Associate's Degrees in Computer Science, Mathematics, and Art and Digital Media from Diablo Valley College.

I have taken a number of MOOCs (Massive Open Online Courses) on Coursera and Udacity, most of which relate to various fields within artificial intelligence such as machine learning, natural language processing, probabilistic graphical models, AI for robotics, and game theory, in addition to classes in other more general topics such as databases, cryptography, and software as a service. I find the MOOC paradigm very interesting and useful for expanding people's knowledge and bringing education in general, as well as specialized topics, to those who otherwise wouldn't have access or time for more traditional structured education.

I enjoy competitions involving computer science: I have competed in two programming competitions so far and have been the coach for my community college's ACM ICPC teams. I look forward to competing in more, as well as in some capture-the-flag competitions once I have a more solid knowledge of computer security. The added pressure of competition and the unique problems presented make for a fun and rewarding challenge that allows me to utilize and hone my skills as a computer science major.

I also try to attend talks on the topic of cyber security when I can, as they help increase my knowledge of the security threats that must be dealt with daily, not only by security professionals but by people across other fields within and even outside the technology industry. The issue of secure coding practices especially relates to programmers like myself, but other topics are interesting even when not directly relevant. As National Cyber Security Awareness Month is coming in October, I look forward to some interesting new offerings in the way of free seminars and talks.

Upon graduation I intend to work in a field utilizing my interests in artificial intelligence or information security, or likely some combination of the two since AI techniques continue to become more prevalent within all of technology. I enjoy problem solving, which seems a prerequisite of being a successful programmer, and enjoy finding unique solutions to problems using my programming knowledge and skill.