Sunday, September 4, 2011

Python solution of Project Euler, problem 6

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

I don't have much to say about this solution. I found this one pretty trivial to implement with Python.  I'm posting my solution just to keep everything in sequence. :)  Basically, the question is asking to find x, where

x = (1 + 2 + .. + 100)2 - (12 + 12 + ... + 1002)
def main():
  N = 100
  sum1, sum2 = 0, 0
  for i in range(1, N + 1):
    sum1 += i
    sum2 += pow(i,2)
  return pow(sum1,2) - sum2

if '__main__' == __name__:
  print main()

Edit: the iterative solution was trivial, but the mathematical solution is much more efficient and clever:

Mathematical Solution (credit to Lov Loothra)
N = 100
s1 = N*(N + 1)/2 # sum of first N numbers
s2 = N*(N + 1)*(2*N + 1)/6 # sum of first N squares
print s1*s1 - s2

Saturday, September 3, 2011

Python solution of Project Euler, problem 5

What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?

I enjoyed working on this problem. It's short, simple, and didn't take very long. I learned nothing new about Python, but I enjoyed it because of the optimizations I realized that brought the execution time to about 0.4 seconds.  The key optimizations are the following:

  1. When checking the divisibility of the numbers [1,20], you really only need to check [11,20] as the [2,10] are implicitly checked (for example, if 20 divides n, we know 10, 5, 4, and 2 divide n, if 14 divides n, we know 7 and 2 divide n, and so on).
  2. We can increment each value by 380 to significantly reduce the number of numbers tried. Why 380? Well, my first guess was to just increment by 20, since it's the highest number that must be divisible by a value, but then I realized that 19 * 2, 19 * 3, ..., 19 * 19 are never divisible by 20, so we increment by 19 * 20 = 380. Does this have to do with the fact 19 is prime? I'm not sure, can somebody prove it? The difference in performance was huge between my first solution, where I incremented 20, and the current solution (about a 1000% increase).

Python Implementation:
def main():
  val, found = 380, False
  while not found:
    # numbers below 11 do not need to be checked because they are already
    # implicitly checked by the higher numbers; e.g., if 20 divides val, so
    # does 10, same for 18 and 9, 16 and 8, etc.
    found = val % 20 == 0 and val % 19 == 0 and val % 18 == 0 and \
        val % 17 == 0 and val % 16 == 0 and val % 15 == 0 and \
        val % 14 == 0 and val % 13 == 0 and val % 12 == 0 and val % 11 == 0
    if not found:
      val += 380
  return val

if '__main__' == __name__:
  print main()

The fact that Python short circuits the conditional after the first failure is important. It may also be more elegant to put the values [20,11] in a range and iterate them, but the iteration introduces more operation and a part of my goal is to make these solutions as efficient as possible (while also being readable and not using too many bad practices).

PHP POST variables getting truncated? Check suhosin

I worked on an issue lately where input from a textarea was getting truncated before it reached the PHP $_REQUEST or $_POST variables. The solution was simple, but it took me way too long to figure out, so maybe this will help somebody from Google one day. 


The site I had this problem with used php-cgi and nginx to serve requests. The the first things I checked where configurations in php.ini and nginx.conf (and other included nginx config files).
  • php.ini  - make sure post_max_size is high enough.  Mine was set to 8M, which seems to be common.
  • nginx.conf - client_max_body_size should also be enough.  This parameter is the maximum number of bytes a client can send to the server per request (I don't think this includes a file upload, may be wrong).
My problem was with suhosin.  Since I wasn't the original architect of this system, I knew nothing about this module.  It is described to "protect servers and users from known and unknown flaws in PHP applications and the PHP core".  


The problem was that the suhosin post.max_value_length parameter wasn't high enough.  By, default, it is 65000 (~63 KB).  I added this parameter to in my .ini config with a high-enough value, restarted php-cgi, and all was good.

Sunday, August 14, 2011

Syntax Highlighter for Blogger

When I first started this blog I knew I'd be posting code snippets.  I think most developers will agree that code is much nicer to look at if it is highlighted nicely. There are dozens of articles out there that tell you how to do this, and I tried a few of them. But they just didn't look right, lines were overlapping, numbers were lined up correct, etc. I realized that one of the top pages from a Google search was from 2009.  The API that he refers to has evolved since.

The library of javascript and CSS written by Alex Gorbatchev, called SyntaxHighlighter. The version released as I'm writing this is 3.0.83.

The first thing to do is include the Javascript and CSS on your page. In your control panel, go to Design > Edit HTML. I put the lines to load the CSS and javascript code below the <title><data:blog.pageTitle/></title> element.







Note that I load Python and Ruby scripts by default, but you'll want to change that to whatever language you want to post code for. You can find the list of supported languages here. If you don't always post code snippets and don't want to load these files on every page of your blog, you can also include the script and link tags in the body of your blog post.

The last thing to do is to make sure that javascript gets invoked on your code. I added this right above </body> in the HTML template.

I put my code in in a <pre /> element. To configure an element for a certain language, use the class attribute. For example:
<pre class="brush: php">
echo "Hello World";
</pre>
There are many other configuration options, see Alex's website for more information.

Saturday, August 13, 2011

Learning Python with Project Euler: Problem 4

This one was pretty simple to find the answer, though I'm not convinced it's nearly as efficient as I could be (I'd love some recommendations!). The only part that tool much thought was the function to determine if a string is a palindrome. What it does is it compare the first and last characters of the string, moving inward if they are equal. The other function just iterates all possible combinations of 2 3-digit numbers, takes their product, and stores it if it is larger than the previously stored value. I just a range that starts at the end and decrements, though it has no advantage over starting from the beginning. It goes from 999 to 99 because the 2nd value isn't inclusive, so the range really stops at 100.

As far as learning python goes, I did have to see how to iterate a list of numbers backwards, ending up with range(start, end, step).

# val - string
def is_palindrome(val):
  val = list(val)
  front, end = 0, len(val) - 1
  while front < end:
    if val[front] == val[end]:
      front += 1
      end -= 1
    else:
      return False
  return True

def solve():
  vmax = -1
  for i in range(999,99,-1):
    for j in range(999,99,-1):
      vint = i * j
      if is_palindrome(str(vint)) and vint > vmax:
        vmax = vint      
  return vmax

print 'Answer: ' + str(solve())

Learning Python with Project Euler: Problem 3

Turns out my original implementation is severely flawed, I was just lucky that it was good enough to work for the number Project Euler asked for...  Stay tuned for another implementation...

So the first two problems were a piece of cake.  Problem 3 requires a little more thinking.  It's necessary to implement an efficient algorithm to determine the largest primes factor of a very large number.  This one took me longer than I am proud to admit.  I think my first implementation was correct, but it's not efficient and will not return an answer in any acceptable time.  My second implementation found the answer in a fraction of a second.  I'm ashamed I didn't get this one sooner!

So what were the challenges to this problem?
  • implement an algorithm to determine if a number n is prime
  • implement an efficient way to determine the largest prime number of an integer
What eventually solved the problem was when I determined where to start checking numbers to see if they are factors and if they are prime.

Bullet point 2 was the kicker for me.  My first implementation was stupid because I decided I would find ALL prime factors of x and select the max of them.  This works fine for small numbers, but not so much for large ones.

import math
import sys

# determine if a number is prime
# precondition: n > 2 and n is odd
def is_prime(n):
  m = 2
  
  # if n is composite, then at least 1 factor is less than sqrt(n)
  max_m = math.sqrt(n) # calculate this value only once
  while m <= max_m:
    if n % m == 0:
      # if m divides n then n is not prime, return false
      return False
    m += 1

  return True

# allow an argument to specify the value of x, otherwise default it to what
# project euler is asking for
x = int(sys.argv[1]) if len(sys.argv) > 1 else 600851475143

# according to Direct Search Factorization [1], numbers greater than the floor
# of the sqrt of x need not be checked (see the article for the reason why)
i = int(math.floor(math.sqrt(x)))

# if i is even, minus 1 so we start at an odd number
if i & 1 == 0: i -= 1

# this loop will break once a prime factor is found
found = None
while i > 1:
  # the order these are checked is VERY important.  if you try the
  # is_prime method first you'll lose efficiency by MANY orders of
  # magnitude (I was too impatient to see if it would even finish running)
  if x % i == 0 and is_prime(i):
    found = i
    break
  i -= 2 # add 2 to avoid even numbers

if found == None:
  print str(x) + ' is prime and has no other prime factors'
else:
  print "largest prime factor: " + str(i)

# [1] http://mathworld.wolfram.com/DirectSearchFactorization.html

Learning Python with Project Euler: Problem 2

This is another short problem in Project Euler. The problem states:

By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.

So, we need to know what the Fibonacci sequence is. The first two numbers in the sequence are 1 and 1. After that, the next number is the sum of the previous two numbers.  Mathematically, we can define this sequence by:  F(n)=F(n-1)+F(n-2), F(0) = 1, F(1) = 1

Below is my Python implementation.  I use a tuple, bitwise and operation, Python's ability to assign multiple variables on the same line.

# only persist the last 2 values of the sequence
vsum, fib_nums = 0, (1, 1)
while fib_nums[0] < 4000000:
  # shift the 2nd value to the 1st position, add the previous 2 values and set
  # it at the 2nd position
  fib_nums = (fib_nums[1], fib_nums[0] + fib_nums[1])
 
  # only add latest fib number if it is even; using bitwise operations we can
  # tell if a number is even or odd by checking the last bit
  if fib_nums[1] & 1 == 0: vsum += fib_nums[1]

print 'sum: ' + str(vsum)

Saturday, August 6, 2011

Learning Python with Project Euler: Problem 1

This one is short and simple. There was really nothing to learn as far as Python goes, the constructs are similar to Ruby, though I'm sure there's a nicer way to write it. I'll also write my Ruby implementation which is a littler shorter. I'll likely try to write more efficient implementations in both languages.

The problem:
Add all the natural numbers below one thousand that are multiples of 3 or 5.

My Python implementation:
vsum, i = 0, 0
while i < 1000:
  vsum += i if i % 3 == 0 or i % 5 == 0 else 0
  i += 1
print 'sum: ' + str(vsum)

My Ruby implementation:
sum = 0
(0...1000).each { |v| sum += (v % 3 == 0 or v % 5 == 0) ? v : 0 }
puts "sum: #{sum}"

Contribution by Lov Loothra (250+ times faster than mine):
import sys
def main():
    N = 999

    #sum of all multiples of 3 < 1000
    sum3 = (N/3*(N - N%3 + 3))/2

    #sum of all multiples of 5 < 1000
    sum5 = (N/5*(N - N%5 + 5))/2

    #sum of all multiples of 3 & 5 (i.e. 15) < 1000
    sum15 = (N/15*(N - N%15 + 15))/2

    #final result
    print sum3 + sum5 - sum15

Introduction

Welcome to my first blog.  I'm tentivly naming is "Coding Problems", though I can't guarantee it'll be strictly about coding and programming, but that'll probably be the main topic.

I have a strong interest in computer science and mathematics. I'm particularly interested in the software engineering side of computer science, where engineers use and develop techniques to increase productivity and the overall quality of their code. I figure by documenting some experiences I have with programming, I'll not only have a nice log of my experiences to look bad on someday, but maybe others will find it helpful or interesting.

I'm mostly skilled in Ruby and Java.  Though I have exposure to several other languages, including but not limited to C, Python, PHP, VB.NET, and Haskell.  I'm very interested in functional languages.  I'd love to become skilled in Haskell someday; what I did taught myself a couple years back was a blast, but unfortunately I didn't stick with it.  I'm also interested in learning Scala, Closure, and/or Groovy.  Professionally I work with Ruby, VB.NET, PHP, Java, MySQL, TSQL, and scripting languages.